NOVEL NUCLEIC ACIDS AND POLYPEPTIDES

1. TECHNICAL FIELD 

The present invention provides novel polynucleotides and proteins encoded by such
polynucleotides, along with uses for these polynucleotides and proteins, for example in
therapeutic, diagnostic and research methods.

2. BACKGROUND 

Technology aimed at the discovery of protein factors (including, e.g., cytokines such as
lymphokines, interferons, CSFs, chemokines, and interleukins) has matured rapidly over the past
decade. The now routine hybridization cloning and expression cloning techniques clone novel
polynucleotides "directly" in the sense that they rely on information directly related to the
discovered protein (i.e., partial DNA/amino acid sequence of the protein in the case of
hybridization cloning; activity of the protein in the case of expression cloning). More recent
"indirect" cloning techniques such as signal sequence cloning, which isolates DNA sequences
based on the presence of a now well-recognized secretory leader sequence motif, as well as
various PCR-based or low stringency hybridization-based cloning techniques, have advanced the
state of the art by making available large numbers of DNA/amino acid sequences for proteins
that are known to have biological activity, for example, by virtue of their secreted nature in the
case of leader sequence cloning, by virtue of their cell or tissue source in the case of PCR-based
techniques, or by virtue of structural similarity to other genes of known biological activity.

Identified polynucleotide and polypeptide sequences have numerous applications in, for
example, diagnostics, forensics, gene mapping, identification of mutations responsible for
genetic disorders or other traits, assessment of biodiversity, and production of many other
types of data and products dependent on DNA and amino acid sequences.

3. SUMMARY OF THE INVENTION 

The compositions of the present invention include novel isolated polypeptides, novel
isolated polynucleotides encoding such polypeptides, including recombinant DNA molecules,
cloned genes or degenerate variants thereof, especially naturally occurring variants such as allelic
variants, antisense polynucleotide molecules, and antibodies that specifically recognize one or more
epitopes present on such polypeptides, as well as hybridomas producing such antibodies.

The compositions of the present invention additionally include vectors, including expression
vectors, containing the polynucleotides of the invention, cells genetically engineered to contain such
polynucleotides and cells genetically engineered to express such polynucleotides.

The present invention relates to a collection or library of at least one novel nucleic acid
sequence assembled from expressed sequence tags (ESTs) isolated mainly by sequencing by
hybridization (SBH), and in some cases, sequences obtained from one or more public databases.
The invention relates also to the proteins encoded by such polynucleotides, along with therapeutic,
diagnostic and research utilities for these polynucleotides and proteins. These nucleic acid
sequences are designated as SEQ ID NO: 1-1786 and 3573-5358. The polypeptide sequences are
designated SEQ ID NO: 2n (wherein n = 1 to 20). The nucleic acids and polypeptides are provided
in the Sequence Listing. In the nucleic acids provided in the Sequence Listing, A is adenine; C is
cytosine; G is guanine; T is thymine; and N is any of the four bases. In the amino acids provided in
the Sequence Listing, * corresponds to the stop codon.

The nucleic acid sequences of the present invention also include nucleic acid sequences that
hybridize to the complement of SEQ ID NO: 1-1786 and 3573-5358 under stringent hybridization
conditions; nucleic acid sequences which are allelic variants or species homologues of any of the
nucleic acid sequences recited above; and nucleic acid sequences that encode a peptide comprising a
specific domain or truncation of the peptides encoded by SEQ ID NO: 1-1786 and 3573-5358. Also
included is a polynucleotide comprising a nucleotide sequence having at least 90% identity to an
identifying sequence of SEQ ID NO: 1-1786 and 3573-5358, or a degenerate variant or fragment
thereof. The identifying sequence can be 100 base pairs in length.

The nucleic acid sequences of the present invention also include the sequence information
from the nucleic acid sequences of SEQ ID NO: 1-1786 and 3573-5358. The sequence information
can be a segment of any one of SEQ ID NO: 1-1786 and 3573-5358 that uniquely identifies or
represents the sequence information of SEQ ID NO: 1-1786 and 3573-5358.

A collection as used in this application can be a collection of only one polynucleotide. The
collection of sequence information or identifying information of each sequence can be provided on
a nucleic acid array. In one embodiment, segments of sequence information are provided on a
nucleic acid array to detect the polynucleotide that contains the segment. The array can be designed
to detect full-match or mismatch to the polynucleotide that contains the segment. The collection
can also be provided in a computer-readable format.

This invention also includes the reverse or direct complement of any of the nucleic acid
sequences recited above; cloning or expression vectors containing the nucleic acid sequences; and
host cells or organisms transformed with these expression vectors. Nucleic acid sequences (or their
reverse or direct complements) according to the invention have numerous applications in a variety
of techniques known to those skilled in the art of molecular biology, such as use as hybridization
probes, use as primers for PCR, use in an array, use in computer-readable media, use in sequencing
full-length genes, use for chromosome and gene mapping, use in the recombinant production of
protein, and use in the generation of anti-sense DNA or RNA, their chemical analogs and the like.

In a preferred embodiment, the nucleic acid sequences of SEQ ID NO: 1-1786 and 3573-5358
or novel segments or parts of the nucleic acids of the invention are used as primers in
expression assays that are well known in the art. In a particularly preferred embodiment, the nucleic
acid sequences of SEQ ID NO: 1-1786 and 3573-5358 or novel segments or parts of the nucleic
acids provided herein are used in diagnostics for identifying expressed genes or, as well known in
the art and exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for
physical mapping of the human genome.

The isolated polynucleotides of the invention include, but are not limited to, a
polynucleotide comprising any one of the nucleotide sequences set forth in SEQ ID NO: 1-1786 and
3573-5358; a polynucleotide comprising any of the full length protein coding sequences of SEQ ID
NO: 1-1786 and 3573-5358; and a polynucleotide comprising any of the nucleotide sequences of the
mature protein coding sequences of SEQ ID NO: 1-1786 and 3573-5358. The polynucleotides of the
present invention also include, but are not limited to, a polynucleotide that hybridizes under
stringent hybridization conditions to (a) the complement of any one of the nucleotide sequences set
forth in SEQ ID NO: 1-1786 and 3573-5358; (b) a nucleotide sequence encoding any one of the
amino acid sequences set forth in the Sequence Listing; (c) a polynucleotide which is an allelic
variant of any polynucleotides recited above; (d) a polynucleotide which encodes a species homolog
(e.g., orthologs) of any of the proteins recited above; or (e) a polynucleotide that encodes a
polypeptide comprising a specific domain or truncation of any of the polypeptides comprising an
amino acid sequence set forth in the Sequence Listing.

The isolated polypeptides of the invention include, but are not limited to, a polypeptide
comprising any of the amino acid sequences set forth in the Sequence Listing; or the corresponding
full length or mature protein. Polypeptides of the invention also include polypeptides with biological
activity that are encoded by (a) any of the polynucleotides having a nucleotide sequence set forth in
SEQ ID NO: 1-1786 and 3573-5358; or (b) polynucleotides that hybridize to the complement of the
polynucleotides of (a) under stringent hybridization conditions. Biologically or immunologically
active variants of any of the polypeptide sequences in the Sequence Listing, and "substantial
equivalents" thereof (e.g., with at least about 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or 99%
amino acid sequence identity) that preferably retain biological activity are also contemplated. The
polypeptides of the invention may be wholly or partially chemically synthesized but are preferably
produced by recombinant means using the genetically engineered cells (e.g., host cells) of the
invention.

The invention also provides compositions comprising a polypeptide of the invention.
Polypeptide compositions of the invention may further comprise an acceptable carrier, such as a
hydrophilic, e.g., pharmaceutically acceptable, carrier.

The invention also provides host cells transformed or transfected with a polynucleotide of
the invention.

The invention also relates to methods for producing a polypeptide of the invention
comprising growing a culture of the host cells of the invention in a suitable culture medium
under conditions permitting expression of the desired polypeptide, and purifying the polypeptide
from the culture or from the host cells. Preferred embodiments include those in which the
protein produced by such process is a mature form of the protein.

Polynucleotides according to the invention have numerous applications in a variety of
techniques known to those skilled in the art of molecular biology. These techniques include use
as hybridization probes, use as oligomers, or primers, for PCR, use for chromosome and gene
mapping, use in the recombinant production of protein, and use in generation of anti-sense DNA
or RNA, their chemical analogs and the like. For example, when the expression of an mRNA is
largely restricted to a particular cell or tissue type, polynucleotides of the invention can be used
as hybridization probes to detect the presence of the particular cell or tissue mRNA in a sample
using, e.g., in situ hybridization.

In other exemplary embodiments, the polynucleotides are used in diagnostics as
expressed sequence tags for identifying expressed genes or, as well known in the art and
exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for physical
mapping of the human genome.

The polypeptides according to the invention can be used in a variety of conventional
procedures and methods that are currently applied to other proteins. For example, a polypeptide
of the invention can be used to generate an antibody that specifically binds the polypeptide. Such
antibodies, particularly monoclonal antibodies, are useful for detecting or quantitating the
polypeptide in tissue. The polypeptides of the invention can also be used as molecular weight
markers, and as a food supplement.

Methods are also provided for preventing, treating, or ameliorating a medical condition
which comprises the step of administering to a mammalian subject a therapeutically effective
amount of a composition comprising a polypeptide of the present invention and a
pharmaceutically acceptable carrier.

In particular, the polypeptides and polynucleotides of the invention can be utilized, for
example, in methods for the prevention and/or treatment of disorders involving aberrant protein
expression or biological activity.

The present invention further relates to methods for detecting the presence of the
polynucleotides or polypeptides of the invention in a sample. Such methods can, for example, be
utilized as part of prognostic and diagnostic evaluation of disorders as recited herein and for the
identification of subjects exhibiting a predisposition to such conditions. The invention provides
a method for detecting the polynucleotides of the invention in a sample, comprising contacting
the sample with a compound that binds to and forms a complex with the polynucleotide of
interest, for a period and under conditions sufficient to form the complex, and detecting the
complex such that, if a complex is detected, the polynucleotide of interest is detected. The
invention also provides a method for detecting the polypeptides of the invention in a sample,
comprising contacting the sample with a compound that binds to and forms a complex with the
polypeptide, under conditions and for a period sufficient to form the complex, and detecting the
formation of the complex such that, if a complex is formed, the polypeptide is detected.

The invention also provides kits comprising polynucleotide probes and/or monoclonal
antibodies, and optionally quantitative standards, for carrying out methods of the invention.
Furthermore, the invention provides methods for evaluating the efficacy of drugs, and
monitoring the progress of patients, involved in clinical trials for the treatment of disorders as
recited above.

The invention also provides methods for the identification of compounds that modulate
(i.e., increase or decrease) the expression or activity of the polynucleotides and/or polypeptides
of the invention. Such methods can be utilized, for example, for the identification of compounds
that can ameliorate symptoms of disorders as recited herein. Such methods can include, but are
not limited to, assays for identifying compounds and other substances that interact with (e.g.,
bind to) the polypeptides of the invention. The invention provides a method for identifying a
compound that binds to the polypeptides of the invention comprising contacting the compound
with a polypeptide of the invention in a cell for a time sufficient to form a polypeptide/compound
complex, wherein the complex drives expression of a reporter gene sequence in the cell; and
detecting the complex by detecting the reporter gene sequence expression such that, if expression
of the reporter gene is detected, the compound that binds to a polypeptide of the invention is
identified.

The invention also provides methods for treatment which involve the
administration of the polynucleotides or polypeptides of the invention to individuals exhibiting
symptoms or tendencies. In addition, the invention encompasses methods for treating diseases or
disorders as recited herein comprising administering compounds and other substances that
modulate the overall activity of the target gene products. Compounds and other substances can
effect such modulation either on the level of target gene/protein expression or target protein
activity.

The polypeptides of the present invention and the polynucleotides encoding them are also 
useful for the same functions known to one of skill in the art as the polypeptides and 
polynucleotides to which they have homology (set forth in Table 2); for which they have a 
signature region (as set forth in Table 3); or for which they have homology to a gene family (as 
set forth in Table 4). If no homology is set forth for a sequence, then the polypeptides and 
polynucleotides of the present invention are useful for a variety of applications, as described 
herein, including use in arrays for detection. 

4. DETAILED DESCRIPTION OF THE INVENTION 



4.1 DEFINITIONS 

It must be noted that as used herein and in the appended claims, the singular forms "a",
"an" and "the" include plural references unless the context clearly dictates otherwise.

The term "active" refers to those forms of the polypeptide which retain the biologic
and/or immunologic activities of any naturally occurring polypeptide. According to the
invention, the terms "biologically active" or "biological activity" refer to a protein or peptide
having structural, regulatory or biochemical functions of a naturally occurring molecule.
Likewise, "immunologically active" or "immunological activity" refers to the capability of the
natural, recombinant or synthetic polypeptide to induce a specific immune response in
appropriate animals or cells and to bind with specific antibodies.

The term "activated cells" as used in this application refers to those cells which are engaged in
extracellular or intracellular membrane trafficking, including the export of secretory or
enzymatic molecules as part of a normal or disease process.

The terms "complementary" or "complementarity" refer to the natural binding of
polynucleotides by base pairing. For example, the sequence 5'-AGT-3' binds to the
complementary sequence 3'-TCA-5'. Complementarity between two single-stranded molecules
may be "partial" such that only some of the bases pair, or it may be "complete" such that
total complementarity exists between the single-stranded molecules. The degree of
complementarity between the nucleic acid strands has significant effects on the efficiency and
strength of the hybridization between the nucleic acid strands.
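
As a minimal illustrative sketch of the base-pairing rule described above (not part of the
disclosed methods), the following Python derives the complementary strand of a short DNA
sequence and measures how much of an aligned pair of strands is paired. The function names and
the restriction to the four unmodified DNA bases are assumptions made for illustration only.

```python
# Watson-Crick pairing for DNA (assumption: uppercase A, C, G, T only).
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement_3_to_5(seq_5_to_3: str) -> str:
    """Return the complementary strand read 3'->5', aligned base by base."""
    return "".join(PAIRS[base] for base in seq_5_to_3)


def reverse_complement(seq_5_to_3: str) -> str:
    """Return the complementary strand written in the usual 5'->3' direction."""
    return complement_3_to_5(seq_5_to_3)[::-1]


def fraction_paired(strand_5_to_3: str, other_3_to_5: str) -> float:
    """Fraction of aligned positions that pair; 1.0 means complete complementarity."""
    if len(strand_5_to_3) != len(other_3_to_5) or not strand_5_to_3:
        raise ValueError("strands must be non-empty and of equal length")
    paired = sum(PAIRS[a] == b for a, b in zip(strand_5_to_3, other_3_to_5))
    return paired / len(strand_5_to_3)


print(complement_3_to_5("AGT"))        # strand pairing with 5'-AGT-3', read 3'->5'
print(fraction_paired("AGT", "TCA"))   # complete complementarity
print(fraction_paired("AGT", "TCG"))   # partial complementarity
```

In this sketch, "partial" complementarity corresponds to a returned fraction below 1.0.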

The term "embryonic stem cells (ES)" refers to cells that can give rise to many
differentiated cell types in an embryo or an adult, including the germ cells. The term "germ line
stem cells (GSCs)" refers to stem cells derived from primordial stem cells that provide a steady
and continuous source of germ cells for the production of gametes. The term "primordial germ
cells (PGCs)" refers to a small population of cells set aside from other cell lineages, particularly
from the yolk sac, mesenteries, or gonadal ridges, during embryogenesis that have the potential to
differentiate into germ cells and other cells. PGCs are the source from which GSCs and ES cells
are derived. The PGCs, the GSCs and the ES cells are capable of self-renewal. Thus these cells
not only populate the germ line and give rise to a plurality of terminally differentiated cells that
comprise the adult specialized organs, but are able to regenerate themselves.

The term "expression modulating fragment," EMF, means a series of nucleotides which
modulates the expression of an operably linked ORF or another EMF.

As used herein, a sequence is said to "modulate the expression of an operably linked
sequence" when the expression of the sequence is altered by the presence of the EMF. EMFs
include, but are not limited to, promoters, and promoter modulating sequences (inducible
elements). One class of EMFs comprises nucleic acid fragments which induce the expression of an
operably linked ORF in response to a specific regulatory factor or physiological event.

The terms "nucleotide sequence" or "nucleic acid" or "polynucleotide" or
"oligonucleotide" are used interchangeably and refer to a heteropolymer of nucleotides or the
sequence of these nucleotides. These phrases also refer to DNA or RNA of genomic or synthetic
origin which may be single-stranded or double-stranded and may represent the sense or the
antisense strand, to peptide nucleic acid (PNA) or to any DNA-like or RNA-like material. In the
sequences herein A is adenine, C is cytosine, T is thymine, G is guanine and N is A, C, G or T
(U). It is contemplated that where the polynucleotide is RNA, the T (thymine) in the sequences
provided herein is substituted with U (uracil). Generally, nucleic acid segments provided by this
invention may be assembled from fragments of the genome and short oligonucleotide linkers, or
from a series of oligonucleotides, or from individual nucleotides, to provide a synthetic nucleic
acid which is capable of being expressed in a recombinant transcriptional unit comprising
regulatory elements derived from a microbial or viral operon, or a eukaryotic gene.
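
The sequence notation above (A, C, G, T, with N standing for any base and U replacing T in RNA)
can be illustrated with a short Python sketch. This is only an example under stated assumptions:
the function names are hypothetical, and N is treated as a simple wildcard when a listed sequence
is compared with an observed one.

```python
def to_rna(dna: str) -> str:
    """Write a listed DNA sequence as RNA by substituting U (uracil) for T (thymine)."""
    return dna.upper().replace("T", "U")


def consistent_with_listing(listed: str, observed: str) -> bool:
    """True if `observed` agrees with `listed`, where N in `listed` matches any base."""
    listed, observed = listed.upper(), observed.upper()
    if len(listed) != len(observed):
        return False
    return all(l == "N" or l == o for l, o in zip(listed, observed))


print(to_rna("ATGNT"))
print(consistent_with_listing("ANT", "AGT"))
print(consistent_with_listing("ANT", "AGC"))
```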

The terms "oligonucleotide fragment" or a "polynucleotide fragment", "portion," or
"segment" or "probe" or "primer" are used interchangeably and refer to a sequence of nucleotide
residues which are at least about 5 nucleotides, more preferably at least about 7 nucleotides,
more preferably at least about 9 nucleotides, more preferably at least about 11 nucleotides and
most preferably at least about 17 nucleotides. The fragment is preferably less than about 500
nucleotides, preferably less than about 200 nucleotides, more preferably less than about 100
nucleotides, more preferably less than about 50 nucleotides and most preferably less than 30
nucleotides. Preferably the probe is from about 6 nucleotides to about 200 nucleotides,
preferably from about 15 to about 50 nucleotides, more preferably from about 17 to 30
nucleotides and most preferably from about 20 to 25 nucleotides. Preferably the fragments can
be used in polymerase chain reaction (PCR), various hybridization procedures or microarray
procedures to identify or amplify identical or related parts of mRNA or DNA molecules. A
fragment or segment may uniquely identify each polynucleotide sequence of the present
invention. Preferably the fragment comprises a sequence substantially similar to any one of SEQ
ID NOs: 1-20.

Probes may, for example, be used to determine whether specific mRNA molecules are
present in a cell or tissue or to isolate similar nucleic acid sequences from chromosomal DNA as
described by Walsh et al. (Walsh, P.S. et al., 1992, PCR Methods Appl 1:241-250). They may
be labeled by nick translation, Klenow fill-in reaction, PCR, or other methods well known in the
art. Probes of the present invention, their preparation and/or labeling are elaborated in
Sambrook, J. et al., 1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory, NY; or Ausubel, F.M. et al., 1989, Current Protocols in Molecular Biology, John
Wiley & Sons, New York NY, both of which are incorporated herein by reference in their
entirety.

The nucleic acid sequences of the present invention also include the sequence
information from the nucleic acid sequences of SEQ ID NO: 1-1786 and 3573-5358. The
sequence information can be a segment of any one of SEQ ID NO: 1-1786 and 3573-5358 that
uniquely identifies or represents the sequence information of that sequence of SEQ ID NO:
1-1786 and 3573-5358. One such segment can be a twenty-mer nucleic acid sequence because the
probability that a twenty-mer is fully matched in the human genome is approximately 1 in 300.
In the human genome, there are three billion base pairs in one set of chromosomes. Because 4^20
possible twenty-mers exist, there are approximately 300 times more twenty-mers than there are
base pairs in a set of human chromosomes. Using the same analysis, the probability for a
seventeen-mer to be fully matched in the human genome is approximately 1 in 5. When these
segments are used in arrays for expression studies, fifteen-mer segments can be used. The
probability that the fifteen-mer is fully matched in the expressed sequences is also
approximately one in five because expressed sequences comprise less than approximately 5% of
the entire genome sequence.
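
The arithmetic behind these estimates can be reproduced with a short Python sketch. It assumes,
as the passage does, a genome of three billion base pairs and an expressed fraction of 5% (the
stated upper bound), and it treats every k-mer as equally likely; the constant and function names
are chosen here for illustration.

```python
GENOME_BP = 3_000_000_000      # base pairs in one set of human chromosomes (from the text)
EXPRESSED_FRACTION = 0.05      # expressed sequences as a fraction of the genome (upper bound)


def kmers_per_position(k: int, target_bp: float) -> float:
    """Ratio of the number of possible k-mers (4**k) to the number of positions searched."""
    return 4 ** k / target_bp


cases = [
    (20, GENOME_BP, "whole genome"),
    (17, GENOME_BP, "whole genome"),
    (15, GENOME_BP * EXPRESSED_FRACTION, "expressed sequences"),
]
for k, target_bp, label in cases:
    ratio = kmers_per_position(k, target_bp)
    print(f"{k}-mer vs {label}: about 1 chance match in {ratio:.1f}")
```

Under these assumptions the ratios come out near 366, 5.7 and 7.2 respectively, which is the
same order of magnitude as the rounded figures of 300, 5 and 5 given above.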

Similarly, when using sequence information for detecting a single mismatch, a segment can
be a twenty-five-mer. The probability that the twenty-five-mer would appear in a human genome
with a single mismatch is calculated by multiplying the probability for a full match (1 ÷ 4^25) by
the increased probability for mismatch at each nucleotide position (3 x 25). The probability that
an eighteen-mer with a single mismatch can be detected in an array for expression studies is
approximately one in five. The probability that a twenty-mer with a single mismatch can be
detected in a human genome is approximately one in five.
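
The single-mismatch calculation can be sketched in Python in the same way. The formula is the
one stated above, (1 ÷ 4^k) x (3 x k); the genome size and the 5% expressed fraction are carried
over from the preceding paragraph, and the function names are illustrative assumptions.

```python
GENOME_BP = 3_000_000_000               # base pairs in one set of human chromosomes
EXPRESSED_BP = 0.05 * GENOME_BP         # assumed size of the expressed sequences


def single_mismatch_probability(k: int) -> float:
    """Probability that a random k-mer matches a given k-mer with exactly one mismatch."""
    full_match = 1 / 4 ** k             # probability of a full match
    mismatch_choices = 3 * k            # 3 alternative bases at each of k positions
    return full_match * mismatch_choices


def expected_occurrences(k: int, target_bp: float) -> float:
    """Expected number of single-mismatch occurrences across the target sequence."""
    return single_mismatch_probability(k) * target_bp


print(expected_occurrences(25, GENOME_BP))      # twenty-five-mer in the genome
print(expected_occurrences(20, GENOME_BP))      # twenty-mer in the genome
print(expected_occurrences(18, EXPRESSED_BP))   # eighteen-mer in expressed sequences
```

With these inputs the twenty-mer and eighteen-mer cases evaluate to roughly 0.16 and 0.12, that
is, on the order of the "approximately one in five" stated above, while the twenty-five-mer case
is several orders of magnitude smaller.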

The term "open reading frame," ORF, means a series of nucleotide triplets coding for
amino acids without any termination codons and is a sequence translatable into protein.
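
A minimal Python sketch of this definition follows. It assumes the termination codons of the
standard genetic code (TAA, TAG, TGA), which the definition itself does not list, and the
function names are hypothetical.

```python
# Termination codons of the standard genetic code (an assumption; not listed in the text).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq: str, frame: int = 0) -> list[str]:
    """Split a DNA sequence into complete triplets starting at offset `frame`."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


def is_open_reading_frame(seq: str) -> bool:
    """True if `seq` is a whole number of triplets and contains no termination codon."""
    if len(seq) < 3 or len(seq) % 3 != 0:
        return False
    return not any(codon in STOP_CODONS for codon in codons(seq))


print(is_open_reading_frame("ATGGCC"))   # two coding triplets
print(is_open_reading_frame("ATGTAA"))   # second triplet is a termination codon
```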

The terms "operably linked" or "operably associated" refer to functionally related nucleic
acid sequences. For example, a promoter is operably associated or operably linked with a coding
sequence if the promoter controls the transcription of the coding sequence. While operably
linked nucleic acid sequences can be contiguous and in the same reading frame, certain genetic
elements, e.g., repressor genes, are not contiguously linked to the coding sequence but still control
transcription/translation of the coding sequence.

The term "pluripotent" refers to the capability of a cell to differentiate into a number of
differentiated cell types that are present in an adult organism. A pluripotent cell is restricted in its
differentiation capability in comparison to a totipotent cell.

The terms "polypeptide" or "peptide" or "amino acid sequence" refer to an oligopeptide,
peptide, polypeptide or protein sequence or fragment thereof and to naturally occurring or
synthetic molecules. A polypeptide "fragment," "portion," or "segment" is a stretch of amino
acid residues of at least about 5 amino acids, preferably at least about 7 amino acids, more
preferably at least about 9 amino acids and most preferably at least about 17 or more amino
acids. The peptide preferably is not greater than about 200 amino acids, more preferably less
than 150 amino acids and most preferably less than 100 amino acids. Preferably the peptide is
from about 5 to about 200 amino acids. To be active, any polypeptide must have sufficient
length to display biological and/or immunological activity.

The term "naturally occurring polypeptide" refers to polypeptides produced by cells that 
have not been genetically engineered and specifically contemplates various polypeptides arising 
from post-translational modifications of the polypeptide including, but not limited to, acetylation, 
carboxylation, glycosylation, phosphorylation, lipidation and acylation. 

The term "translated protein coding portion" means a sequence which encodes the full
length protein, which may include any leader sequence or any processing sequence.

The term "mature protein coding sequence" means a sequence which encodes a peptide
or protein without a signal or leader sequence. The "mature protein portion" means that portion
of the protein which does not include a signal or leader sequence. The peptide may have been
produced by processing in the cell which removes any leader/signal sequence. The mature
protein portion may or may not include the initial methionine residue. The methionine residue
may be removed from the protein during processing in the cell. The peptide may be produced
synthetically or the protein may have been produced using a polynucleotide encoding only
the mature protein coding sequence.



The term "derivative" refers to polypeptides chemically modified by such techniques as
ubiquitination, labeling (e.g., with radionuclides or various enzymes), covalent polymer
attachment such as pegylation (derivatization with polyethylene glycol) and insertion or
substitution by chemical synthesis of amino acids such as ornithine, which do not normally occur
in human proteins.

The term "variant" (or "analog") refers to any polypeptide differing from naturally
occurring polypeptides by amino acid insertions, deletions, and substitutions, created using, e.g.,
recombinant DNA techniques. Guidance in determining which amino acid residues may be
replaced, added or deleted without abolishing activities of interest may be found by comparing
the sequence of the particular polypeptide with that of homologous peptides and minimizing the
number of amino acid sequence changes made in regions of high homology (conserved regions)
or by replacing amino acids with consensus sequence.

Alternatively, recombinant variants encoding these same or similar polypeptides may be
synthesized or selected by making use of the "redundancy" in the genetic code. Various codon
substitutions, such as the silent changes which produce various restriction sites, may be
introduced to optimize cloning into a plasmid or viral vector or expression in a particular
prokaryotic or eukaryotic system. Mutations in the polynucleotide sequence may be reflected in
the polypeptide or domains of other peptides added to the polypeptide to modify the properties of
any part of the polypeptide, to change characteristics such as ligand-binding affinities, interchain
affinities, or degradation/turnover rate.
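
By way of illustration only, the following Python sketch shows a silent codon change of the kind
described above: a single-base substitution that leaves the encoded amino acids unchanged while
creating a restriction site. The standard genetic code table and the BamHI recognition sequence
(GGATCC) are used as assumptions for the example; neither is taken from the text, and the
sequences are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon in the order TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

BAMHI_SITE = "GGATCC"  # illustrative restriction site


def translate(dna: str) -> str:
    """Translate complete triplets of a DNA coding sequence ('*' marks a stop codon)."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def is_silent(original: str, variant: str) -> bool:
    """True if both sequences have the same length and encode the same amino acids."""
    return len(original) == len(variant) and translate(original) == translate(variant)


original = "GGATCT"   # hypothetical coding fragment
variant = "GGATCC"    # third base of the second codon changed, T -> C

print(translate(original), translate(variant))
print(is_silent(original, variant), BAMHI_SITE in variant)
```

Both fragments encode glycine followed by serine, so the change is silent at the protein level
while introducing the chosen site.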

Preferably, amino acid "substitutions" are the result of replacing one amino acid with
another amino acid having similar structural and/or chemical properties, i.e., conservative amino
acid replacements. "Conservative" amino acid substitutions may be made on the basis of
similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic
nature of the residues involved. For example, nonpolar (hydrophobic) amino acids include
alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan, and methionine; polar
neutral amino acids include glycine, serine, threonine, cysteine, tyrosine, asparagine, and
glutamine; positively charged (basic) amino acids include arginine, lysine, and histidine; and
negatively charged (acidic) amino acids include aspartic acid and glutamic acid. "Insertions" or
"deletions" are preferably in the range of about 1 to 20 amino acids, more preferably 1 to 10
amino acids. The variation allowed may be experimentally determined by systematically making
insertions, deletions, or substitutions of amino acids in a polypeptide molecule using
recombinant DNA techniques and assaying the resulting recombinant variants for activity.
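
The grouping of amino acids given above can be expressed as a simple lookup. The Python sketch
below is an illustrative assumption about how such a classification might be applied: it uses
one-letter amino acid codes and treats a substitution as conservative when both residues fall in
the same listed group. It does not replace the experimental determination described above.

```python
# Amino acid groups as listed in the text, in one-letter code.
GROUPS = {
    "nonpolar": set("ALIVPFWM"),       # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "polar_neutral": set("GSTCYNQ"),   # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": set("RKH"),               # Arg, Lys, His
    "acidic": set("DE"),               # Asp, Glu
}


def group_of(residue: str) -> str:
    """Return the name of the group containing the given one-letter residue."""
    for name, members in GROUPS.items():
        if residue.upper() in members:
            return name
    raise ValueError(f"unknown amino acid: {residue!r}")


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues belong to the same group listed above."""
    return group_of(original) == group_of(replacement)


print(is_conservative("L", "I"))   # both nonpolar
print(is_conservative("K", "E"))   # basic replaced by acidic
```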

Alternatively, where alteration of function is desired, insertions, deletions or
non-conservative alterations can be engineered to produce altered polypeptides. Such alterations
can, for example, alter one or more of the biological functions or biochemical characteristics of
the polypeptides of the invention. For example, such alterations may change polypeptide
characteristics such as ligand-binding affinities, interchain affinities, or degradation/turnover
rate. Further, such alterations can be selected so as to generate polypeptides that are better suited
for expression, scale up and the like in the host cells chosen for expression. For example,
cysteine residues can be deleted or substituted with another amino acid residue in order to
eliminate disulfide bridges.

The terms "purified" or "substantially purified" as used herein denote that the indicated
nucleic acid or polypeptide is present in the substantial absence of other biological
macromolecules, e.g., polynucleotides, proteins, and the like. In one embodiment, the
polynucleotide or polypeptide is purified such that it constitutes at least 95% by weight, more
preferably at least 99% by weight, of the indicated biological macromolecules present (but water,
buffers, and other small molecules, especially molecules having a molecular weight of less than
1000 daltons, can be present).

The term "isolated" as used herein refers to a nucleic acid or polypeptide separated from
at least one other component (e.g., nucleic acid or polypeptide) present with the nucleic acid or
polypeptide in its natural source. In one embodiment, the nucleic acid or polypeptide is found in
the presence of (if anything) only a solvent, buffer, ion, or other component normally present in a
solution of the same. The terms "isolated" and "purified" do not encompass nucleic acids or
polypeptides present in their natural source.

The term "recombinant," when used herein to refer to a polypeptide or protein, means 
that a polypeptide or protein is derived from recombinant (e.g., microbial, insect, or mammalian) 
expression systems. "Microbial" refers to recombinant polypeptides or proteins made in 
bacterial or fungal (e.g., yeast) expression systems. As a product, "recombinant microbial"
defines a polypeptide or protein essentially free of native endogenous substances and
unaccompanied by associated native glycosylation. Polypeptides or proteins expressed in most
bacterial cultures, e.g., E. coli, will be free of glycosylation modifications; polypeptides or 
proteins expressed in yeast will have a glycosylation pattern in general different from those 
expressed in mammalian cells. 

The term "recombinant expression vehicle or vector" refers to a plasmid, phage, virus,
or other vector for expressing a polypeptide from a DNA (RNA) sequence. An expression vehicle can
comprise a transcriptional unit comprising an assembly of (1) a genetic element or elements
having a regulatory role in gene expression, for example, promoters or enhancers, (2) a structural
or coding sequence which is transcribed into mRNA and translated into protein, and (3)
appropriate transcription initiation and termination sequences. Structural units intended for use
in yeast or eukaryotic expression systems preferably include a leader sequence enabling
extracellular secretion of translated protein by a host cell. Alternatively, where recombinant
protein is expressed without a leader or transport sequence, it may include an amino terminal
methionine residue. This residue may or may not be subsequently cleaved from the expressed
recombinant protein to provide a final product.

The term "recombinant expression system" means host cells which have stably integrated
a recombinant transcriptional unit into chromosomal DNA or carry the recombinant
transcriptional unit extrachromosomally. Recombinant expression systems as defined herein will
express heterologous polypeptides or proteins upon induction of the regulatory elements linked
to the DNA segment or synthetic gene to be expressed. This term also means host cells which
have stably integrated a recombinant genetic element or elements having a regulatory role in
gene expression, for example, promoters or enhancers. Recombinant expression systems as
defined herein will express polypeptides or proteins endogenous to the cell upon induction of the
regulatory elements linked to the endogenous DNA segment or gene to be expressed. The cells
can be prokaryotic or eukaryotic.

The term "secreted" includes a protein that is transported across or through a membrane, 
including transport as a result of signal sequences in its amino acid sequence when it is expressed 
in a suitable host cell. "Secreted" proteins include without limitation proteins secreted wholly 
(e.g., soluble proteins) or partially (e.g., receptors) from the cell in which they are expressed.
"Secreted" proteins also include without limitation proteins that are transported across the
membrane of the endoplasmic reticulum. "Secreted" proteins are also intended to include
proteins containing non-typical signal sequences (e.g., Interleukin-1 Beta, see Krasney, P.A. and
Young, P.R. (1992) Cytokine 4(2):134-143) and factors released from damaged cells (e.g.,
Interleukin-1 Receptor Antagonist, see Arend, W.P. et al. (1998) Annu. Rev. Immunol.
16:27-55).

Where desired, an expression vector may be designed to contain a "signal or leader 
sequence" which will direct the polypeptide through the membrane of a cell. Such a sequence 
may be naturally present on the polypeptides of the present invention or provided from 
heterologous protein sources by recombinant DNA techniques. 

The term "stringent" is used to refer to conditions that are commonly understood in the
art as stringent. Stringent conditions can include highly stringent conditions (i.e., hybridization
to filter-bound DNA in 0.5 M NaHPO4, 7% sodium dodecyl sulfate (SDS), 1 mM EDTA at
65°C, and washing in 0.1X SSC/0.1% SDS at 68°C), and moderately stringent conditions (i.e.,
washing in 0.2X SSC/0.1% SDS at 42°C). Other exemplary hybridization conditions are
described herein in the examples.

In instances of hybridization of deoxyoligonucleotides, additional exemplary stringent
hybridization conditions include washing in 6X SSC/0.05% sodium pyrophosphate at 37°C (for
14-base oligonucleotides), 48°C (for 17-base oligonucleotides), 55°C (for 20-base
oligonucleotides), and 60°C (for 23-base oligonucleotides).
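
The length-dependent wash temperatures listed above can be captured as a small lookup table. The Python sketch below is only a convenience illustration; the names `OLIGO_WASH_TEMP_C` and `wash_temperature_c` are hypothetical, and the function deliberately refuses to interpolate for lengths that the text does not list.

```python
# Exemplary wash temperatures (degrees C) in 6X SSC/0.05% sodium pyrophosphate,
# keyed by oligonucleotide length in bases, as listed in the text above.
OLIGO_WASH_TEMP_C = {14: 37, 17: 48, 20: 55, 23: 60}


def wash_temperature_c(oligo_length):
    """Return the exemplary wash temperature for a listed oligonucleotide length."""
    if oligo_length not in OLIGO_WASH_TEMP_C:
        # No interpolation: the text gives conditions only for four lengths.
        raise ValueError(
            f"no exemplary condition listed for {oligo_length}-base oligonucleotides"
        )
    return OLIGO_WASH_TEMP_C[oligo_length]
```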

As used herein, "substantially equivalent" can refer both to nucleotide and amino acid
sequences, for example a mutant sequence, that varies from a reference sequence by one or more
substitutions, deletions, or additions, the net effect of which does not result in an adverse
functional dissimilarity between the reference and subject sequences. Typically, such a
substantially equivalent sequence varies from one of those listed herein by no more than about
35% (i.e., the number of individual residue substitutions, additions, and/or deletions in a
substantially equivalent sequence, as compared to the corresponding reference sequence, divided
by the total number of residues in the substantially equivalent sequence is about 0.35 or less).
Such a sequence is said to have 65% sequence identity to the listed sequence. In one
embodiment, a substantially equivalent, e.g., mutant, sequence of the invention varies from a
listed sequence by no more than 30% (70% sequence identity); in a variation of this embodiment,
by no more than 25% (75% sequence identity); in a further variation of this embodiment, by
no more than 20% (80% sequence identity); in a further variation of this embodiment, by no
more than 10% (90% sequence identity); and in a further variation of this embodiment, by no
more than 5% (95% sequence identity). Substantially equivalent, e.g., mutant, amino acid
sequences according to the invention preferably have at least 80% sequence identity with a listed
amino acid sequence, more preferably at least 90% sequence identity. Substantially equivalent
nucleotide sequences of the invention can have lower percent sequence identities, taking into
account, for example, the redundancy or degeneracy of the genetic code. Preferably, the nucleotide
sequence has at least about 65% identity, more preferably at least about 75% identity, and most
preferably at least about 95% identity. For the purposes of the present invention, sequences
having substantially equivalent biological activity and substantially equivalent expression
characteristics are considered substantially equivalent. For the purposes of determining
equivalence, truncation of the mature sequence (e.g., via a mutation which creates a spurious
stop codon) should be disregarded. Sequence identity may be determined, e.g., using the Jotun
Hein method (Hein, J. (1990) Methods Enzymol. 183:626-645). Identity between sequences can
also be determined by other methods known in the art, e.g., by varying hybridization conditions.
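
The arithmetic in the definition above (differences divided by the number of residues in the substantially equivalent sequence) can be sketched in Python as follows. This is a minimal illustration that assumes the two sequences have already been aligned by some other method and are supplied as equal-length strings with "-" marking gaps; the function names, the gap character, and the treatment of "about 0.35" as a hard threshold are assumptions of the sketch. It addresses only the numerical criterion, not the functional equivalence also required by the definition.

```python
def variation_fraction(reference_aligned, subject_aligned, gap="-"):
    """Differences per residue of the subject (substantially equivalent) sequence.

    Each alignment column where the two sequences differ counts as one
    substitution, addition, or deletion.
    """
    if len(reference_aligned) != len(subject_aligned):
        raise ValueError("aligned sequences must have equal length")
    differences = sum(
        1 for r, s in zip(reference_aligned, subject_aligned) if r != s
    )
    subject_residues = sum(1 for s in subject_aligned if s != gap)
    if subject_residues == 0:
        raise ValueError("subject sequence contains no residues")
    return differences / subject_residues


def percent_identity(reference_aligned, subject_aligned):
    """Percent sequence identity implied by the variation fraction."""
    return 100.0 * (1.0 - variation_fraction(reference_aligned, subject_aligned))


def meets_variation_limit(reference_aligned, subject_aligned, max_variation=0.35):
    """True when the variation is within the stated limit (numerical test only)."""
    return variation_fraction(reference_aligned, subject_aligned) <= max_variation
```

For example, a subject sequence with one substitution and one deletion over nine remaining residues has a variation fraction of 2/9, which is within the 0.35 limit.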

The term "totipotent" refers to the capability of a cell to differentiate into all of the cell 
types of an adult organism. 

The term "transformation" means introducing DNA into a suitable host cell so that the
DNA is replicable, either as an extrachromosomal element, or by chromosomal integration. The
term "transfection" refers to the taking up of an expression vector by a suitable host cell, whether
or not any coding sequences are in fact expressed. The term "infection" refers to the introduction
of nucleic acids into a suitable host cell by use of a virus or viral vector.

As used herein, an "uptake modulating fragment," UMF, means a series of nucleotides
which mediate the uptake of a linked DNA fragment into a cell. UMFs can be readily identified
using known UMFs as a target sequence or target motif with the computer-based systems
described below. The presence and activity of a UMF can be confirmed by attaching the
suspected UMF to a marker sequence. The resulting nucleic acid molecule is then incubated
with an appropriate host under appropriate conditions and the uptake of the marker sequence is
determined. As described above, a UMF will increase the frequency of uptake of a linked
marker sequence.

Each of the above terms is meant to encompass all that is described for each, unless the 
context dictates otherwise. 

4.2 NUCLEIC ACIDS OF THE INVENTION

Nucleotide sequences of the invention are set forth in the Sequence Listing.
The isolated polynucleotides of the invention include a polynucleotide comprising the
nucleotide sequences of SEQ ID NO:1-1786 and 3573-5358; a polynucleotide encoding any one
of the peptide sequences of SEQ ID NO:1787-3572 and 5359-7144; and a polynucleotide
comprising the nucleotide sequence encoding the mature protein coding sequence of the
polypeptides of any one of SEQ ID NO:1787-3572 and 5359-7144. The polynucleotides of the
present invention also include, but are not limited to, a polynucleotide that hybridizes under
stringent conditions to (a) the complement of any of the nucleotide sequences of
SEQ ID NO:1-1786 and 3573-5358; (b) nucleotide sequences encoding any one of the amino acid
sequences set forth in the Sequence Listing; (c) a polynucleotide which is an allelic variant of any
polynucleotide recited above; (d) a polynucleotide which encodes a species homolog of any of
the proteins recited above; or (e) a polynucleotide that encodes a polypeptide comprising a
specific domain or truncation of the polypeptides of SEQ ID NO:1787-3572 and 5359-7144.
Domains of interest may depend on the nature of the encoded polypeptide; e.g., domains in
receptor-like polypeptides include ligand-binding, extracellular, transmembrane, or cytoplasmic
domains, or combinations thereof; domains in immunoglobulin-like proteins include the variable
immunoglobulin-like domains; domains in enzyme-like polypeptides include catalytic and
substrate binding domains; and domains in ligand polypeptides include receptor-binding
domains.

The polynucleotides of the invention include naturally occurring or wholly or partially
synthetic DNA, e.g., cDNA and genomic DNA, and RNA, e.g., mRNA. The polynucleotides
may include all of the coding region of the cDNA or may represent a portion of the coding
region of the cDNA.

The present invention also provides genes corresponding to the cDNA sequences disclosed
herein. The corresponding genes can be isolated in accordance with known methods using the
sequence information disclosed herein. Such methods include the preparation of probes or primers
from the disclosed sequence information for identification and/or amplification of genes in
appropriate genomic libraries or other sources of genomic materials. Further 5' and 3' sequence can
be obtained using methods known in the art. For example, full length cDNA or genomic DNA that
corresponds to any of the polynucleotides of SEQ ID NO:1-1786 and 3573-5358 can be obtained
by screening appropriate cDNA or genomic DNA libraries under suitable hybridization conditions
using any of the polynucleotides of SEQ ID NO:1-1786 and 3573-5358 or a portion thereof as a
probe. Alternatively, the polynucleotides of SEQ ID NO:1-1786 and 3573-5358 may be used as the
basis for suitable primer(s) that allow identification and/or amplification of genes in appropriate
genomic DNA or cDNA libraries.

The nucleic acid sequences of the invention can be assembled from ESTs and sequences 
(including cDNA and genomic sequences) obtained from one or more public databases, such as 
dbEST, gbpri, and UniGene. The EST sequences can provide identifying sequence information,
representative fragment or segment information, or novel segment information for the full-length
gene. 

The invention also provides polynucleotides including nucleotide
sequences that are substantially equivalent to the polynucleotides recited above. Polynucleotides
according to the invention can have, e.g., at least about 65%, at least about 70%, at least about
75%, at least about 80%, more typically at least about 90%, and even more typically at least
about 95%, sequence identity to a polynucleotide recited above.

Included within the scope of the nucleic acid sequences of the invention are nucleic acid
sequence fragments that hybridize under stringent conditions to any of the nucleotide sequences
of SEQ ID NO:1-1786 and 3573-5358, or complements thereof, which fragment is greater than
about 5 nucleotides, preferably 7 nucleotides, more preferably greater than 9 nucleotides and
most preferably greater than 17 nucleotides. Fragments of, e.g., 15, 17, or 20 nucleotides or more
that are selective for (i.e., specifically hybridize to) any one of the polynucleotides of the
invention are contemplated. Probes capable of specifically hybridizing to a polynucleotide can
differentiate polynucleotide sequences of the invention from other polynucleotide sequences in
the same family of genes or can differentiate human genes from genes of other species, and are
preferably based on unique nucleotide sequences.

The sequences falling within the scope of the present invention are not limited to these
specific sequences, but also include allelic and species variations thereof. Allelic and species
variations can be routinely determined by comparing the sequence provided in SEQ ID NO:1-1786
and 3573-5358, a representative fragment thereof, or a nucleotide sequence at least 90% identical,
preferably 95% identical, to SEQ ID NO:1-1786 and 3573-5358 with a sequence from another
isolate of the same species. Furthermore, to accommodate codon variability, the invention includes
nucleic acid molecules coding for the same amino acid sequences as do the specific ORFs disclosed
herein. In other words, in the coding region of an ORF, substitution of one codon for another codon
that encodes the same amino acid is expressly contemplated.
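
The codon substitution described above can be illustrated with a short Python sketch that translates two coding sequences with the standard genetic code and checks whether they encode the same amino acid sequence. The function names and the example sequences are hypothetical, and the sketch assumes an in-frame DNA sequence containing only A, C, G, and T.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf):
    """Translate an in-frame DNA coding sequence ('*' marks a stop codon)."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def encode_same_protein(orf_a, orf_b):
    """True when two coding sequences differ only by synonymous codons."""
    return translate(orf_a) == translate(orf_b)


# Hypothetical example: GCT -> GCC (Ala) and TGT -> TGC (Cys) are synonymous.
original = "ATGGCTTGT"
variant = "ATGGCCTGC"
same = encode_same_protein(original, variant)
```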

The nearest neighbor or homology result for the nucleic acids of the present invention,
including SEQ ID NO:1-1786 and 3573-5358, can be obtained by searching a database using an
algorithm or a program. Preferably, BLAST, which stands for Basic Local Alignment Search Tool,
is used to search for local sequence alignments (Altschul, S.F. J. Mol. Evol. 36:290-300 (1993) and
Altschul, S.F. et al. J. Mol. Biol. 215:403-410 (1990)). Alternatively, a FASTA version 3 search
against Genpept, using the Fastxy algorithm, can be used.

Species homologs (or orthologs) of the disclosed polynucleotides and proteins are also 
provided by the present invention. Species homologs may be isolated and identified by making
suitable probes or primers from the sequences provided herein and screening a suitable nucleic
acid source from the desired species. 

The invention also encompasses allelic variants of the disclosed polynucleotides or 
proteins; that is, naturally-occurring alternative forms of the isolated polynucleotide which also 
encode proteins which are identical, homologous or related to that encoded by the
polynucleotides.

The nucleic acid sequences of the invention are further directed to sequences which
encode variants of the described nucleic acids. These amino acid sequence variants may be
prepared by methods known in the art by introducing appropriate nucleotide changes into a
native or variant polynucleotide. There are two variables in the construction of amino acid
sequence variants: the location of the mutation and the nature of the mutation. Nucleic acids
encoding the amino acid sequence variants are preferably constructed by mutating the
polynucleotide to encode an amino acid sequence that does not occur in nature. These nucleic
acid alterations can be made at sites that differ in the nucleic acids from different species
(variable positions) or in highly conserved regions (constant regions). Sites at such locations
will typically be modified in series, e.g., by substituting first with conservative choices (e.g.,
hydrophobic amino acid to a different hydrophobic amino acid) and then with more distant
choices (e.g., hydrophobic amino acid to a charged amino acid), and then deletions or insertions
may be made at the target site. Amino acid sequence deletions generally range from about 1 to
30 residues, preferably about 1 to 10 residues, and are typically contiguous. Amino acid
insertions include amino- and/or carboxyl-terminal fusions ranging in length from one to one
hundred or more residues, as well as intrasequence insertions of single or multiple amino acid
residues. Intrasequence insertions may range generally from about 1 to 10 amino acid residues,
preferably from 1 to 5 residues. Examples of terminal insertions include the heterologous signal
sequences necessary for secretion or for intracellular targeting in different host cells and
sequences such as FLAG or poly-histidine sequences useful for purifying the expressed protein.
In a preferred method, polynucleotides encoding the novel amino acid sequences are
changed via site-directed mutagenesis. This method uses oligonucleotide sequences to alter a
polynucleotide to encode the desired amino acid variant, as well as sufficient adjacent
nucleotides on both sides of the changed amino acid to form a stable duplex on either side of the
site being changed. In general, the techniques of site-directed mutagenesis are well known to
those of skill in the art and this technique is exemplified by publications such as Edelman et al.,
DNA 2:183 (1983). A versatile and efficient method for producing site-specific changes in a
polynucleotide sequence was published by Zoller and Smith, Nucleic Acids Res. 10:6487-6500
(1982). PCR may also be used to create amino acid sequence variants of the novel nucleic acids.
When small amounts of template DNA are used as starting material, primer(s) that differ
slightly in sequence from the corresponding region in the template DNA can generate the desired
amino acid variant. PCR amplification results in a population of product DNA fragments that
differ from the polynucleotide template encoding the polypeptide at the position specified by the
primer. The product DNA fragments replace the corresponding region in the plasmid and this
gives a polynucleotide encoding the desired amino acid variant.
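
The structure of a mutagenic oligonucleotide described above (a changed codon flanked on both sides by unchanged template sequence) can be sketched in Python as follows. This is a simplified illustration only: the function name `mutagenic_oligo`, the default flank length, and the example template are assumptions, and real primer design would also consider melting temperature, secondary structure, and strand orientation, none of which is modeled here.

```python
def mutagenic_oligo(template, codon_index, new_codon, flank=12):
    """Build a sense-strand oligonucleotide carrying a single codon change.

    template:    in-frame coding sequence (DNA, 5' to 3')
    codon_index: zero-based index of the codon to replace
    new_codon:   three-base replacement codon
    flank:       number of unchanged template bases kept on each side, so the
                 oligonucleotide can still pair with the template around the
                 changed site
    """
    if len(new_codon) != 3:
        raise ValueError("new_codon must be exactly three bases")
    start = 3 * codon_index
    end = start + 3
    if start - flank < 0 or end + flank > len(template):
        raise ValueError("template too short for the requested flank length")
    return template[start - flank:start] + new_codon + template[end:end + flank]


# Hypothetical example: replace a cysteine codon (TGT) with serine (TCT),
# keeping six template bases on each side of the change.
oligo = mutagenic_oligo("ATGGCTTGTGGAAAACCC", codon_index=2, new_codon="TCT", flank=6)
```

The example mirrors the cysteine substitution mentioned earlier as a way to eliminate disulfide bridges.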

A further technique for generating amino acid variants is the cassette mutagenesis
technique described in Wells et al., Gene 34:315 (1985); and other mutagenesis techniques well
known in the art, such as, for example, the techniques in Sambrook et al., supra, and Current
Protocols in Molecular Biology, Ausubel et al. Due to the inherent degeneracy of the genetic
code, other DNA sequences which encode substantially the same or a functionally equivalent
amino acid sequence may be used in the practice of the invention for the cloning and expression
of these novel nucleic acids. Such DNA sequences include those which are capable of
hybridizing to the appropriate novel nucleic acid sequence under stringent conditions.

Polynucleotides encoding preferred polypeptide truncations of the invention can be used
to generate polynucleotides encoding chimeric or fusion proteins comprising one or more
domains of the invention and heterologous protein sequences.

The polynucleotides of the invention additionally include the complement of any of the
polynucleotides recited above. The polynucleotide can be DNA (genomic, cDNA, amplified, or
synthetic) or RNA. Methods and algorithms for obtaining such polynucleotides are well known
to those of skill in the art and can include, for example, methods for determining hybridization
conditions that can routinely isolate polynucleotides of the desired sequence identities.

In accordance with the invention, polynucleotide sequences comprising the mature
protein coding sequences corresponding to any one of SEQ ID NO:1-1786 and 3573-5358, or
functional equivalents thereof, may be used to generate recombinant DNA molecules that direct 
the expression of that nucleic acid, or a functional equivalent thereof, in appropriate host cells. 
Also included are the cDNA inserts of any of the clones identified herein. 

A polynucleotide according to the invention can be joined to any of a variety of other
nucleotide sequences by well-established recombinant DNA techniques (see Sambrook J et al.
(1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, NY). Useful
nucleotide sequences for joining to polynucleotides include an assortment of vectors, e.g.,
plasmids, cosmids, lambda phage derivatives, phagemids, and the like, that are well known in the
art. Accordingly, the invention also provides a vector including a polynucleotide of the
invention and a host cell containing the polynucleotide. In general, the vector contains an origin
of replication functional in at least one organism, convenient restriction endonuclease sites, and a
selectable marker for the host cell. Vectors according to the invention include expression
vectors, replication vectors, probe generation vectors, and sequencing vectors. A host cell
according to the invention can be a prokaryotic or eukaryotic cell and can be a unicellular
organism or part of a multicellular organism.

The present invention further provides recombinant constructs comprising a nucleic acid
having any of the nucleotide sequences of SEQ ID NO:1-1786 and 3573-5358 or a fragment
thereof or any other polynucleotides of the invention. In one embodiment, the recombinant
constructs of the present invention comprise a vector, such as a plasmid or viral vector, into
which a nucleic acid having any of the nucleotide sequences of SEQ ID NO:1-1786 and
3573-5358 or a fragment thereof is inserted, in a forward or reverse orientation. In the case of a
vector comprising one of the ORFs of the present invention, the vector may further comprise regulatory
sequences, including for example, a promoter, operably linked to the ORF. Large numbers of
suitable vectors and promoters are known to those of skill in the art and are commercially
available for generating the recombinant constructs of the present invention. The following
vectors are provided by way of example. Bacterial: pBs, phagescript, PsiX174, pBluescript SK,
pBs KS, pNH8a, pNH16a, pNH18a, pNH46a (Stratagene); pTrc99A, pKK223-3, pKK233-3,
pDR540, pRIT5 (Pharmacia). Eukaryotic: pWLneo, pSV2cat, pOG44, PXTI, pSG (Stratagene);
pSVK3, pBPV, pMSG, pSVL (Pharmacia).

The isolated polynucleotide of the invention may be operably linked to an expression
control sequence such as the pMT2 or pED expression vectors disclosed in Kaufman et al.,
Nucleic Acids Res. 19, 4485-4490 (1991), in order to produce the protein recombinantly. Many
suitable expression control sequences are known in the art. General methods of expressing
recombinant proteins are also known and are exemplified in R. Kaufman, Methods in
Enzymology 185, 537-566 (1990). As defined herein "operably linked" means that the isolated
polynucleotide of the invention and an expression control sequence are situated within a vector 
or cell in such a way that the protein is expressed by a host cell which has been transformed 
(transfected) with the ligated polynucleotide/expression control sequence. 

Promoter regions can be selected from any desired gene using CAT (chloramphenicol
transferase) vectors or other vectors with selectable markers. Two appropriate vectors are
pKK232-8 and pCM7. Particular named bacterial promoters include lacI, lacZ, T3, T7, gpt,
lambda PR, and trc. Eukaryotic promoters include CMV immediate early, HSV thymidine 
kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of 
the appropriate vector and promoter is well within the level of ordinary skill in the art. 

Generally, recombinant expression vectors will include origins of replication and selectable
markers permitting transformation of the host cell, e.g., the ampicillin resistance gene of E. coli
and S. cerevisiae TRP1 gene, and a promoter derived from a highly-expressed gene to direct
transcription of a downstream structural sequence. Such promoters can be derived from operons
encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), α-factor, acid
phosphatase, or heat shock proteins, among others. The heterologous structural sequence is
assembled in appropriate phase with translation initiation and termination sequences, and
preferably, a leader sequence capable of directing secretion of translated protein into the
periplasmic space or extracellular medium. Optionally, the heterologous sequence can encode a
fusion protein including an amino terminal identification peptide imparting desired
characteristics, e.g., stabilization or simplified purification of expressed recombinant product.
Useful expression vectors for bacterial use are constructed by inserting a structural DNA
sequence encoding a desired protein together with suitable translation initiation and termination
signals in operable reading phase with a functional promoter. The vector will comprise one or
more phenotypic selectable markers and an origin of replication to ensure maintenance of the
vector and to, if desirable, provide amplification within the host. Suitable prokaryotic hosts for
transformation include E. coli, Bacillus subtilis, Salmonella typhimurium and various species
within the genera Pseudomonas, Streptomyces, and Staphylococcus, although others may also be
employed as a matter of choice.

As a representative but non-limiting example, useful expression vectors for bacterial use
can comprise a selectable marker and bacterial origin of replication derived from commercially
available plasmids comprising genetic elements of the well known cloning vector pBR322
(ATCC 37017). Such commercial vectors include, for example, pKK223-3 (Pharmacia Fine
Chemicals, Uppsala, Sweden) and GEM 1 (Promega Biotech, Madison, WI, USA). These
pBR322 "backbone" sections are combined with an appropriate promoter and the structural
sequence to be expressed. Following transformation of a suitable host strain and growth of the
host strain to an appropriate cell density, the selected promoter is induced or derepressed by 
appropriate means (e.g., temperature shift or chemical induction) and cells are cultured for an 
additional period. Cells are typically harvested by centrifugation, disrupted by physical or 
chemical means, and the resulting crude extract retained for further purification. 

Polynucleotides of the invention can also be used to induce immune responses. For
example, as described in Fan et al., Nat. Biotech. 17:870-872 (1999), incorporated herein by
reference, nucleic acid sequences encoding a polypeptide may be used to generate antibodies
against the encoded polypeptide following topical administration of naked plasmid DNA or
following injection, and preferably intramuscular injection, of the DNA. The nucleic acid
sequences are preferably inserted in a recombinant expression vector and may be in the form of
naked DNA.



4.3 ANTISENSE 

Another aspect of the invention pertains to isolated antisense nucleic acid molecules that
are hybridizable to or complementary to the nucleic acid molecule comprising the nucleotide
sequence of SEQ ID NO:1-1786 and 3573-5358, or fragments, analogs or derivatives thereof.
An "antisense" nucleic acid comprises a nucleotide sequence that is complementary to a "sense"
nucleic acid encoding a protein, e.g., complementary to the coding strand of a double-stranded
cDNA molecule or complementary to an mRNA sequence. In specific aspects, antisense nucleic
acid molecules are provided that comprise a sequence complementary to at least about 10, 25,
50, 100, 250 or 500 nucleotides or an entire coding strand, or to only a portion thereof. Nucleic
acid molecules encoding fragments, homologs, derivatives and analogs of a protein of any of
SEQ ID NO:1787-3572 and 5359-7144 or antisense nucleic acids complementary to a nucleic
acid sequence of SEQ ID NO:1-1786 and 3573-5358 are additionally provided.

In one embodiment, an antisense nucleic acid molecule is antisense to a "coding region"
of the coding strand of a nucleotide sequence of the invention. The term "coding region" refers
to the region of the nucleotide sequence comprising codons which are translated into amino acid
residues. In another embodiment, the antisense nucleic acid molecule is antisense to a
"noncoding region" of the coding strand of a nucleotide sequence of the invention. The term
"noncoding region" refers to 5' and 3' sequences which flank the coding region that are not
translated into amino acids (i.e., also referred to as 5' and 3' untranslated regions).

Given the coding strand sequences encoding a nucleic acid disclosed herein (e.g., SEQ ID
NO:1-1786 and 3573-5358), antisense nucleic acids of the invention can be designed according
to the rules of Watson and Crick or Hoogsteen base pairing. The antisense nucleic acid molecule
can be complementary to the entire coding region of an mRNA, but more preferably is an
oligonucleotide that is antisense to only a portion of the coding or noncoding region of an mRNA.
For example, the antisense oligonucleotide can be complementary to the region surrounding the
translation start site of an mRNA. An antisense oligonucleotide can be, for example, about 5, 10,
15, 20, 25, 30, 35, 40, 45 or 50 nucleotides in length. An antisense nucleic acid of the invention
can be constructed using chemical synthesis or enzymatic ligation reactions using procedures
known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can
be chemically synthesized using naturally occurring nucleotides or variously modified
nucleotides designed to increase the biological stability of the molecules or to increase the
physical stability of the duplex formed between the antisense and sense nucleic acids, e.g.,
phosphorothioate derivatives and acridine substituted nucleotides can be used.
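
Designing an antisense oligonucleotide by Watson-Crick pairing amounts to taking the reverse complement of the targeted portion of the coding strand. The Python sketch below illustrates this for the region surrounding the translation start site. It is a minimal illustration under stated assumptions: the function names and window sizes are hypothetical, the first ATG in the supplied sequence is assumed to be the start codon, the output is written as DNA, and neither Hoogsteen pairing nor modified nucleotides are modeled.

```python
# Watson-Crick complements for an unmodified DNA alphabet.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def antisense_to_start_site(coding_strand, upstream=3, downstream=6):
    """Antisense oligonucleotide spanning the region around the start codon.

    Assumes the first ATG in coding_strand is the translation start site and
    covers `upstream` bases before it and `downstream` bases after it.
    """
    coding_strand = coding_strand.upper()
    start = coding_strand.find("ATG")
    if start < 0:
        raise ValueError("no ATG start codon found")
    region = coding_strand[max(0, start - upstream):start + 3 + downstream]
    return reverse_complement(region)


# Hypothetical coding-strand fragment containing a start codon.
oligo = antisense_to_start_site("GCCACCATGGCTTGT")
```

Here the targeted region is ACCATGGCTTGT, so the resulting 12-base antisense oligonucleotide pairs with that stretch of the sense strand.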

Examples of modified nucleotides that can be used to generate the antisense nucleic acid
include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine,
4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine,
5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine,
inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine,
2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine,
7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil,
beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil, 5-methoxyuracil,
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil,
queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil,
uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil,
3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine. Alternatively, the
antisense nucleic acid can be produced biologically using an expression vector into which a
nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the
inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest,
described further in the following subsection).

The antisense nucleic acid molecules of the invention are typically administered to a
subject or generated in situ such that they hybridize with or bind to cellular mRNA and/or
genomic DNA encoding a protein according to the invention to thereby inhibit expression of the
protein, e.g., by inhibiting transcription and/or translation. The hybridization can be by
conventional nucleotide complementarity to form a stable duplex, or, for example, in the case of
an antisense nucleic acid molecule that binds to DNA duplexes, through specific interactions in
the major groove of the double helix. An example of a route of administration of antisense
nucleic acid molecules of the invention includes direct injection at a tissue site. Alternatively,
antisense nucleic acid molecules can be modified to target selected cells and then administered
systemically. For example, for systemic administration, antisense molecules can be modified
such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g.,
by linking the antisense nucleic acid molecules to peptides or antibodies that bind to cell surface
receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using
the vectors described herein. To achieve sufficient intracellular concentrations of antisense
molecules, vector constructs in which the antisense nucleic acid molecule is placed under the
control of a strong pol II or pol III promoter are preferred.

In yet another embodiment, the antisense nucleic acid molecule of the invention is an
α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific
double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the
strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res 15: 6625-6641). The
antisense nucleic acid molecule can also comprise a 2'-o-methylribonucleotide (Inoue et al.
(1987) Nucleic Acids Res 15: 6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987)
FEBS Lett 215: 327-330).



4.4 RIBOZYMES AND PNA MOIETIES 

In still another embodiment, an antisense nucleic acid of the invention is a ribozyme.
Ribozymes are catalytic RNA molecules with ribonuclease activity that are capable of cleaving a
single-stranded nucleic acid, such as an mRNA, to which they have a complementary region.
Thus, ribozymes (e.g., hammerhead ribozymes (described in Haselhoff and Gerlach (1988)
Nature 334:585-591)) can be used to catalytically cleave mRNA transcripts to thereby inhibit
translation of an mRNA. A ribozyme having specificity for a nucleic acid of the invention can be
designed based upon the nucleotide sequence of a DNA disclosed herein (i.e., SEQ ID NO:1-1786
and 3573-5358). For example, a derivative of a Tetrahymena L-19 IVS RNA can be
constructed in which the nucleotide sequence of the active site is complementary to the
nucleotide sequence to be cleaved in a SECX-encoding mRNA. See, e.g., Cech et al. U.S. Pat.
No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742. Alternatively, SECX mRNA can be
used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA
molecules. See, e.g., Bartel et al., (1993) Science 261:1411-1418.

Alternatively, gene expression can be inhibited by targeting nucleotide sequences
complementary to the regulatory region (e.g., promoter and/or enhancers) to form triple helical
structures that prevent transcription of the gene in target cells. See generally, Helene (1991)
Anticancer Drug Des. 6: 569-84; Helene et al. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and
Maher (1992) Bioassays 14: 807-15.

In various embodiments, the nucleic acids of the invention can be modified at the base
moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or
solubility of the molecule. For example, the deoxyribose phosphate backbone of the nucleic
acids can be modified to generate peptide nucleic acids (see Hyrup et al. (1996) Bioorg Med
Chem 4: 5-23). As used herein, the terms "peptide nucleic acids" or "PNAs" refer to nucleic acid
mimics, e.g., DNA mimics, in which the deoxyribose phosphate backbone is replaced by a
pseudopeptide backbone and only the four natural nucleobases are retained. The neutral
backbone of PNAs has been shown to allow for specific hybridization to DNA and RNA under
conditions of low ionic strength. The synthesis of PNA oligomers can be performed using
standard solid phase peptide synthesis protocols as described in Hyrup et al. (1996) above;
Perry-O'Keefe et al. (1996) PNAS 93: 14670-675.

PNAs of the invention can be used in therapeutic and diagnostic applications. For 
example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of 
gene expression by, e.g., inducing transcription or translation arrest or inhibiting replication.
PNAs of the invention can also be used, e.g., in the analysis of single base pair mutations in a
gene by, e.g., PNA directed PCR clamping; as artificial restriction enzymes when used in
combination with other enzymes, e.g., S1 nucleases (Hyrup B. (1996) above); or as probes or
primers for DNA sequence and hybridization (Hyrup et al. (1996), above; Perry-O'Keefe (1996),
above). 

In another embodiment, PNAs of the invention can be modified, e.g., to enhance their
stability or cellular uptake, by attaching lipophilic or other helper groups to PNA, by the
formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of drug
delivery known in the art. For example, PNA-DNA chimeras can be generated that may
combine the advantageous properties of PNA and DNA. Such chimeras allow DNA recognition
enzymes, e.g., RNase H and DNA polymerases, to interact with the DNA portion while the PNA
portion would provide high binding affinity and specificity. PNA-DNA chimeras can be linked
using linkers of appropriate lengths selected in terms of base stacking, number of bonds between
the nucleobases, and orientation (Hyrup (1996) above). The synthesis of PNA-DNA chimeras
can be performed as described in Hyrup (1996) above and Finn et al. (1996) Nucl Acids Res 24:
3357-63. For example, a DNA chain can be synthesized on a solid support using standard
phosphoramidite coupling chemistry, and modified nucleoside analogs, e.g.,
5'-(4-methoxytrityl)amino-5'-deoxy-thymidine phosphoramidite, can be used between the PNA
and the 5' end of DNA (Mag et al. (1989) Nucl Acid Res 17: 5973-88). PNA monomers are then
coupled in a stepwise manner to produce a chimeric molecule with a 5' PNA segment and a 3'
DNA segment (Finn et al. (1996) above). Alternatively, chimeric molecules can be synthesized
with a 5' DNA segment and a 3' PNA segment. See, Petersen et al. (1975) Bioorg Med Chem
Lett 5: 1119-11124.

In other embodiments, the oligonucleotide may include other appended groups such as
peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the
cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad. Sci. U.S.A. 86:6553-6556;
Lemaitre et al., 1987, Proc. Natl. Acad. Sci. 84:648-652; PCT Publication No. WO88/09810) or
the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition,
oligonucleotides can be modified with hybridization-triggered cleavage agents (see, e.g., Krol et
al., 1988, BioTechniques 6:958-976) or intercalating agents (see, e.g., Zon, 1988, Pharm. Res.
5: 539-549). To this end, the oligonucleotide may be conjugated to another molecule, e.g., a
peptide, a hybridization-triggered cross-linking agent, a transport agent, a hybridization-triggered
cleavage agent, etc.



4.5 HOSTS 

The present invention further provides host cells genetically engineered to contain the
polynucleotides of the invention. For example, such host cells may contain nucleic acids of the
invention introduced into the host cell using known transformation, transfection or infection
methods. The present invention still further provides host cells genetically engineered to express
the polynucleotides of the invention, wherein such polynucleotides are in operative association
with a regulatory sequence heterologous to the host cell which drives expression of the
polynucleotides in the cell.

Knowledge of nucleic acid sequences allows for modification of cells to permit, or
increase, expression of endogenous polypeptide. Cells can be modified (e.g., by homologous
recombination) to provide increased polypeptide expression by replacing, in whole or in part, the
naturally occurring promoter with all or part of a heterologous promoter so that the cells express
the polypeptide at higher levels. The heterologous promoter is inserted in such a manner that it
is operatively linked to the encoding sequences. See, for example, PCT International Publication
No. WO94/12650, PCT International Publication No. WO92/20808, and PCT International
Publication No. WO91/09955. It is also contemplated that, in addition to heterologous promoter
DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the coding
sequence, amplification of the marker DNA by standard selection methods results in
co-amplification of the desired protein coding sequences in the cells.

The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a lower 
eukaryotic host cell, such as a yeast cell, or the host cell can be a prokaryotic cell, such as a 
bacterial cell. Introduction of the recombinant construct into the host cell can be effected by 
calcium phosphate transfection, DEAE-dextran mediated transfection, or electroporation (Davis,
L. et al., Basic Methods in Molecular Biology (1986)). The host cells containing one of the
polynucleotides of the invention can be used in conventional manners to produce the gene
product encoded by the isolated fragment (in the case of an ORF) or can be used to produce a 
heterologous protein under the control of the EMF. 

Any host/vector system can be used to express one or more of the ORFs of the present 
invention. These include, but are not limited to, eukaryotic hosts such as HeLa cells, CV-1 cells,
COS cells, 293 cells, and Sf9 cells, as well as prokaryotic hosts such as E. coli and B. subtilis.
The most preferred cells are those which do not normally express the particular polypeptide or
protein or which express the polypeptide or protein at a low natural level. Mature proteins can
be expressed in mammalian cells, yeast, bacteria, or other cells under the control of appropriate 
promoters. Cell-free translation systems can also be employed to produce such proteins using 
RNAs derived from the DNA constructs of the present invention. Appropriate cloning and 
expression vectors for use with prokaryotic and eukaryotic hosts are described by Sambrook, et 
al., in Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, New 
York (1989), the disclosure of which is hereby incorporated by reference. 

Various mammalian cell culture systems can also be employed to express recombinant 
protein. Examples of mammalian expression systems include the COS-7 lines of monkey kidney 
fibroblasts, described by Gluzman, Cell 23:175 (1981). Other cell lines capable of expressing a 
compatible vector are, for example, C127 cells, monkey COS cells, Chinese Hamster Ovary
(CHO) cells, human kidney 293 cells, human epidermal A431 cells, human Colo205 cells, 3T3
cells, CV-1 cells, other transformed primate cell lines, normal diploid cells, cell strains derived
from in vitro culture of primary tissue, primary explants, HeLa cells, mouse L cells, BHK,
HL-60, U937, HaK or Jurkat cells. Mammalian expression vectors will comprise an origin of
replication, a suitable promoter and also any necessary ribosome binding sites, polyadenylation
site, splice donor and acceptor sites, transcriptional termination sequences, and 5' flanking
nontranscribed sequences. DNA sequences derived from the SV40 viral genome, for example,
SV40 origin, early promoter, enhancer, splice, and polyadenylation sites may be used to provide
the required nontranscribed genetic elements. Recombinant polypeptides and proteins produced
in bacterial culture are usually isolated by initial extraction from cell pellets, followed by one or
more salting-out, aqueous ion exchange or size exclusion chromatography steps. Protein
refolding steps can be used, as necessary, in completing configuration of the mature protein.
Finally, high performance liquid chromatography (HPLC) can be employed for final purification
steps. Microbial cells employed in expression of proteins can be disrupted by any convenient 
method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing 
agents. 

Alternatively, it may be possible to produce the protein in lower eukaryotes such as yeast
or insects or in prokaryotes such as bacteria. Potentially suitable yeast strains include
Saccharomyces cerevisiae, Schizosaccharomyces pombe, Kluyveromyces strains, Candida, or
any yeast strain capable of expressing heterologous proteins. Potentially suitable bacterial
strains include Escherichia coli, Bacillus subtilis, Salmonella typhimurium, or any bacterial
strain capable of expressing heterologous proteins. If the protein is made in yeast or bacteria, it
may be necessary to modify the protein produced therein, for example by phosphorylation or
glycosylation of the appropriate sites, in order to obtain the functional protein. Such covalent
attachments may be accomplished using known chemical or enzymatic methods.

In another embodiment of the present invention, cells and tissues may be engineered to
express an endogenous gene comprising the polynucleotides of the invention under the control of
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene
may be replaced by homologous recombination. As described herein, gene targeting can be used
to replace a gene's existing regulatory region with a regulatory sequence isolated from a different
gene or a novel regulatory sequence synthesized by genetic engineering methods. Such
regulatory sequences may be comprised of promoters, enhancers, scaffold-attachment regions,
negative regulatory elements, transcriptional initiation sites, regulatory protein binding sites or
combinations of said sequences. Alternatively, sequences which affect the structure or stability
of the RNA or protein produced may be replaced, removed, added, or otherwise modified by
targeting. These sequences include polyadenylation signals, mRNA stability elements, splice
sites, leader sequences for enhancing or modifying transport or secretion properties of the
protein, or other sequences which alter or improve the function or stability of protein or RNA
molecules.

The targeting event may be a simple insertion of the regulatory sequence, placing the
gene under the control of the new regulatory sequence, e.g., inserting a new promoter or
enhancer or both upstream of a gene. Alternatively, the targeting event may be a simple deletion
of a regulatory element, such as the deletion of a tissue-specific negative regulatory element.
Alternatively, the targeting event may replace an existing element; for example, a tissue-specific
enhancer can be replaced by an enhancer that has broader or different cell-type specificity than
the naturally occurring elements. Here, the naturally occurring sequences are deleted and new
sequences are added. In all cases, the identification of the targeting event may be facilitated by
the use of one or more selectable marker genes that are contiguous with the targeting DNA,
allowing for the selection of cells in which the exogenous DNA has integrated into the host cell
genome. The identification of the targeting event may also be facilitated by the use of one or
more marker genes exhibiting the property of negative selection, such that the negatively
selectable marker is linked to the exogenous DNA, but configured such that the negatively
selectable marker flanks the targeting sequence, and such that a correct homologous
recombination event with sequences in the host cell genome does not result in the stable
integration of the negatively selectable marker. Markers useful for this purpose include the
Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial xanthine-guanine
phosphoribosyl-transferase (gpt) gene.

The gene targeting or gene activation techniques which can be used in accordance with 
this aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to 
Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No. 
PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application No.
PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is incorporated by reference
herein in its entirety. 

4.6 POLYPEPTIDES OF THE INVENTION 

The isolated polypeptides of the invention include, but are not limited to, a polypeptide
comprising: the amino acid sequences set forth as any one of SEQ ID NO:1787-3572 and 5359-7144
or an amino acid sequence encoded by any one of the nucleotide sequences SEQ ID NO:1-1786
and 3573-5358 or the corresponding full length or mature protein. Polypeptides of the
invention also include polypeptides preferably with biological or immunological activity that are
encoded by: (a) a polynucleotide having any one of the nucleotide sequences set forth in SEQ ID
NO:1-1786 and 3573-5358 or (b) polynucleotides encoding any one of the amino acid sequences
set forth as SEQ ID NO:1787-3572 and 5359-7144 or (c) polynucleotides that hybridize to the
complement of the polynucleotides of either (a) or (b) under stringent hybridization conditions.
The invention also provides biologically active or immunologically active variants of any of the
amino acid sequences set forth as SEQ ID NO:1787-3572 and 5359-7144 or the corresponding
full length or mature protein; and "substantial equivalents" thereof (e.g., with at least about
65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least
about 90%, typically at least about 95%, more typically at least about 98%, or most typically at
least about 99% amino acid identity) that retain biological activity. Polypeptides encoded by
allelic variants may have a similar, increased, or decreased activity compared to polypeptides
comprising SEQ ID NO:1787-3572 and 5359-7144.
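
By way of illustration only, the amino acid identity thresholds recited above can be checked with a short calculation. The sketch below assumes that the two sequences have already been aligned to equal length, with "-" marking gaps, and that identity is counted over all aligned columns; both the function names and this counting convention are assumptions made for the example, and published alignment programs may define percent identity differently.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns in which both sequences carry the same
    residue. Inputs must be pre-aligned, equal-length strings using '-'
    for gaps (an assumption of this sketch)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = 0
    columns = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" and res_b == "-":
            continue  # a column that is a gap in both sequences carries no information
        columns += 1
        if res_a == res_b:
            matches += 1
    if columns == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / columns


def is_substantial_equivalent(aligned_a: str, aligned_b: str,
                              threshold: float = 65.0) -> bool:
    """True if identity meets the chosen threshold (65% is the lowest
    value recited in the text; stricter values may be passed instead)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, two hypothetical ten-residue peptides that differ at a single aligned position score 90% identity under this convention. Retention of biological activity is an experimental question that this calculation does not address.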

Fragments of the proteins of the present invention which are capable of exhibiting 
biological activity are also encompassed by the present invention. Fragments of the protein may 
be in linear form or they may be cyclized using known methods, for example, as described in H. 
U. Saragovi, et al., Bio/Technology 10, 773-778 (1992) and in R. S. McDowell, et al., J. Amer.
Chem. Soc. 114, 9245-9253 (1992), both of which are incorporated herein by reference. Such
fragments may be fused to carrier molecules such as immunoglobulins for many purposes, 
including increasing the valency of protein binding sites. 

The present invention also provides both full-length and mature forms (for example,
without a signal sequence or precursor sequence) of the disclosed proteins. The protein coding
sequence is identified in the sequence listing by translation of the disclosed nucleotide
sequences. The mature form of such protein may be obtained by expression of a full-length
polynucleotide in a suitable mammalian cell or other host cell. The sequence of the mature form
of the protein is also determinable from the amino acid sequence of the full-length form. Where
proteins of the present invention are membrane bound, soluble forms of the proteins are also
provided. In such forms, part or all of the regions causing the proteins to be membrane bound
are deleted so that the proteins are fully secreted from the cell in which they are expressed.

Protein compositions of the present invention may further comprise an acceptable carrier, 
such as a hydrophilic, e.g., pharmaceutically acceptable, carrier.

The present invention further provides isolated polypeptides encoded by the nucleic acid 
fragments of the present invention or by degenerate variants of the nucleic acid fragments of the
present invention. By "degenerate variant" is intended nucleotide fragments which differ from a 
nucleic acid fragment of the present invention (e.g., an ORF) by nucleotide sequence but, due to 
the degeneracy of the genetic code, encode an identical polypeptide sequence. Preferred nucleic 
acid fragments of the present invention are the ORFs that encode proteins. 
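
By way of illustration only, the notion of a degenerate variant can be shown with the standard genetic code. In the sketch below, the codon table is the standard code in its conventional TCAG ordering; the function names and the two example ORFs are hypothetical and are not drawn from the sequence listing.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order for each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf: str) -> str:
    """Translate a DNA open reading frame codon by codon ('*' marks a stop)."""
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, usable, 3))


def is_degenerate_variant(orf_a: str, orf_b: str) -> bool:
    """True if the two ORFs differ in nucleotide sequence but encode an
    identical polypeptide sequence."""
    return orf_a.upper() != orf_b.upper() and translate(orf_a) == translate(orf_b)


# Two hypothetical ORFs that differ at three codons yet both encode Met-Ala-Ser-Lys.
orf_1 = "ATGGCTTCTAAA"
orf_2 = "ATGGCGAGCAAG"
same_protein = is_degenerate_variant(orf_1, orf_2)
```

Here `orf_1` and `orf_2` differ in nucleotide sequence but translate to the same four-residue peptide, so `same_protein` is true.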
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A variety of methodologies known in the art can be utilized to obtain any one of the 
isolated polypeptides or proteins of the present invention. At the simplest level, the amino acid
sequence can be synthesized using commercially available peptide synthesizers. The
synthetically-constructed protein sequences, by virtue of sharing primary, secondary or tertiary
structural and/or conformational characteristics with proteins may possess biological properties
in common therewith, including protein activity. This technique is particularly useful in
producing small peptides and fragments of larger polypeptides. Fragments are useful, for
example, in generating antibodies against the native polypeptide. Thus, they may be employed
as biologically active or immunological substitutes for natural, purified proteins in screening of
therapeutic compounds and in immunological processes for the development of antibodies.

The polypeptides and proteins of the present invention can alternatively be purified from 
cells which have been altered to express the desired polypeptide or protein. As used herein, a 
cell is said to be altered to express a desired polypeptide or protein when the cell, through genetic 
manipulation, is made to produce a polypeptide or protein which it normally does not produce or
which the cell normally produces at a lower level. One skilled in the art can readily adapt
procedures for introducing and expressing either recombinant or synthetic sequences into 
eukaryotic or prokaryotic cells in order to generate a cell which produces one of the polypeptides 
or proteins of the present invention. 

The invention also relates to methods for producing a polypeptide comprising growing a
culture of host cells of the invention in a suitable culture medium, and purifying the protein from
the cells or the culture in which the cells are grown. For example, the methods of the invention
include a process for producing a polypeptide in which a host cell containing a suitable
expression vector that includes a polynucleotide of the invention is cultured under conditions that
allow expression of the encoded polypeptide. The polypeptide can be recovered from the
culture, conveniently from the culture medium, or from a lysate prepared from the host cells and
further purified. Preferred embodiments include those in which the protein produced by such
process is a full length or mature form of the protein.

In an alternative method, the polypeptide or protein is purified from bacterial cells which 
naturally produce the polypeptide or protein. One skilled in the art can readily follow known
methods for isolating polypeptides and proteins in order to obtain one of the isolated
polypeptides or proteins of the present invention. These include, but are not limited to,
immunochromatography, HPLC, size-exclusion chromatography, ion-exchange chromatography,
and immuno-affinity chromatography. See, e.g., Scopes, Protein Purification: Principles and
Practice, Springer-Verlag (1994); Sambrook, et al., in Molecular Cloning: A Laboratory
Manual; Ausubel et al., Current Protocols in Molecular Biology. Polypeptide fragments that
retain biological/immunological activity include fragments comprising greater than about 100
amino acids, or greater than about 200 amino acids, and fragments that encode specific protein 
domains. 

The purified polypeptides can be used in in vitro binding assays which are well known in 
the art to identify molecules which bind to the polypeptides. These molecules include, but are not
limited to, small molecules, molecules from combinatorial libraries, antibodies or other
proteins. The molecules identified in the binding assay are then tested for antagonist or agonist 
activity in in vivo tissue culture or animal models that are well known in the art. In brief, the 
molecules are titrated into a plurality of cell cultures or animals and then tested for either 
cell/animal death or prolonged survival of the animal/cells. 

In addition, the peptides of the invention or molecules capable of binding to the peptides 
may be complexed with toxins, e.g., ricin or cholera, or with other compounds that are toxic to 
cells. The toxin-binding molecule complex is then targeted to a tumor or other cell by the 
specificity of the binding molecule for SEQ ID NO:1787-3572 and 5359-7144. 

The protein of the invention may also be expressed as a product of transgenic animals, 
e.g., as a component of the milk of transgenic cows, goats, pigs, or sheep which are characterized 
by somatic or germ cells containing a nucleotide sequence encoding the protein. 

The proteins provided herein also include proteins characterized by amino acid sequences 
similar to those of purified proteins but into which modifications are naturally provided or
deliberately engineered. For example, modifications, in the peptide or DNA sequence, can be 
made by those skilled in the art using known techniques. Modifications of interest in the protein 
sequences may include the alteration, substitution, replacement, insertion or deletion of a 
selected amino acid residue in the coding sequence. For example, one or more of the cysteine 
residues may be deleted or replaced with another amino acid to alter the conformation of the 
molecule. Techniques for such alteration, substitution, replacement, insertion or deletion are 
well known to those skilled in the art (see, e.g., U.S. Pat. No. 4,518,584). Preferably, such 
alteration, substitution, replacement, insertion or deletion retains the desired activity of the 
protein. Regions of the protein that are important for the protein function can be determined by 
various methods known in the art including the alanine-scanning method which involves
systematic substitution of single or strings of amino acids with alanine, followed by testing the 
resulting alanine-containing variant for biological activity. This type of analysis determines the 
importance of the substituted amino acid(s) in biological activity. Regions of the protein that are 
important for protein function may be determined by the eMATRIX program. 
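
By way of illustration only, the set of variants called for by the alanine-scanning method can be enumerated as follows. The function name, the `window` parameter (1 for single residues, larger values for strings of residues), and the example peptide are hypothetical; the sketch only generates the variant sequences, and testing each variant for biological activity remains an experimental step that is not modeled here.

```python
from typing import List, Tuple


def alanine_scan(protein: str, window: int = 1) -> List[Tuple[int, str]]:
    """Return (position, variant) pairs in which `window` consecutive
    residues, starting at the 1-based position, are replaced by alanine.
    Windows that already consist only of alanine are skipped because the
    substitution would not change the sequence."""
    if window < 1 or window > len(protein):
        raise ValueError("window must be between 1 and the protein length")
    variants = []
    for start in range(len(protein) - window + 1):
        segment = protein[start:start + window]
        if set(segment) == {"A"}:
            continue
        variant = protein[:start] + "A" * window + protein[start + window:]
        variants.append((start + 1, variant))
    return variants


# Hypothetical four-residue peptide used only to show the output shape.
single_scan = alanine_scan("MKAC")
double_scan = alanine_scan("MKAC", window=2)
```

A position at which substitution abolishes activity in the subsequent assay would be taken as important for function, consistent with the analysis described above.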

Other fragments and derivatives of the sequences of proteins which would be expected to 
retain protein activity in whole or in part and are useful for screening or other immunological
methodologies may also be easily made by those skilled in the art given the disclosures herein.
Such modifications are encompassed by the present invention. 

The protein may also be produced by operably linking the isolated polynucleotide of the
invention to suitable control sequences in one or more insect expression vectors, and employing
an insect expression system. Materials and methods for baculovirus/insect cell expression
systems are commercially available in kit form from, e.g., Invitrogen, San Diego, Calif., U.S.A.
(the MaxBac™ kit), and such methods are well known in the art, as described in Summers and
Smith, Texas Agricultural Experiment Station Bulletin No. 1555 (1987), incorporated herein by
reference. As used herein, an insect cell capable of expressing a polynucleotide of the present
invention is "transformed."

The protein of the invention may be prepared by culturing transformed host cells under 
culture conditions suitable to express the recombinant protein. The resulting expressed protein 
may then be purified from such culture (i.e., from culture medium or cell extracts) using known 
purification processes, such as gel filtration and ion exchange chromatography. The purification
of the protein may also include an affinity column containing agents which will bind to the
protein; one or more column steps over such affinity resins as concanavalin A-agarose,
heparin-toyopearl™ or Cibacron blue 3GA Sepharose™; one or more steps involving
hydrophobic interaction chromatography using such resins as phenyl ether, butyl ether, or propyl 
ether; or immunoaffinity chromatography. 

Alternatively, the protein of the invention may also be expressed in a form which will
facilitate purification. For example, it may be expressed as a fusion protein, such as those of
maltose binding protein (MBP), glutathione-S-transferase (GST) or thioredoxin (TRX), or as a
His tag. Kits for expression and purification of such fusion proteins are commercially available
from New England BioLabs (Beverly, Mass.), Pharmacia (Piscataway, N.J.) and Invitrogen,
respectively. The protein can also be tagged with an epitope and subsequently purified by using
a specific antibody directed to such epitope. One such epitope ("FLAG®") is commercially
available from Kodak (New Haven, Conn.).

Finally, one or more reverse-phase high performance liquid chromatography (RP-HPLC)
steps employing hydrophobic RP-HPLC media, e.g., silica gel having pendant methyl or other
aliphatic groups, can be employed to further purify the protein. Some or all of the foregoing
purification steps, in various combinations, can also be employed to provide a substantially 
homogeneous isolated recombinant protein. The protein thus purified is substantially free of 
other mammalian proteins and is defined in accordance with the present invention as an "isolated 
protein." 






WO 01/53312 PCT/US00/34263 

The polypeptides of the invention include analogs (variants). This embraces fragments,
as well as peptides in which one or more amino acids have been deleted, inserted, or substituted.
Also, analogs of the polypeptides of the invention embrace fusions of the polypeptides or
modifications of the polypeptides of the invention, wherein the polypeptide or analog is fused to
another moiety or moieties, e.g., a targeting moiety or another therapeutic agent. Such analogs
may exhibit improved properties such as activity and/or stability. Examples of moieties which
may be fused to the polypeptide or an analog include, for example, targeting moieties which
provide for the delivery of polypeptide to pancreatic cells, e.g., antibodies to pancreatic cells,
antibodies to immune cells such as T-cells, monocytes, dendritic cells, granulocytes, etc., as well
as receptors and ligands expressed on pancreatic or immune cells. Other moieties which may be
fused to the polypeptide include therapeutic agents which are used for treatment, for example,
immunosuppressive drugs such as cyclosporin, FK506, azathioprine, CD3 antibodies and
steroids. Also, polypeptides may be fused to immune modulators, and other cytokines such as
alpha or beta interferon.


4.6.1 DETERMINING POLYPEPTIDE AND POLYNUCLEOTIDE IDENTITY 
AND SIMILARITY 

Preferred methods to determine identity and/or similarity are designed to give the largest
match between the sequences tested. Methods to determine identity and similarity are codified in
computer programs including, but not limited to, the GCG program package, including GAP
(Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984); Genetics Computer Group,
University of Wisconsin, Madison, WI), BLASTP, BLASTN, BLASTX, FASTA (Altschul, S.F.
et al., J. Molec. Biol. 215:403-410 (1990)), PSI-BLAST (Altschul S.F. et al., Nucleic Acids Res.
vol. 25, pp. 3389-3402, herein incorporated by reference), eMatrix software (Wu et al., J. Comp.
Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by reference), eMotif software
(Nevill-Manning et al., ISMB-97, Vol. 4, pp. 202-209, herein incorporated by reference), pFam software
(Sonnhammer et al., Nucleic Acids Res., Vol. 26(1), pp. 320-322 (1998), herein incorporated by
reference) and the Kyte-Doolittle hydrophobicity prediction algorithm (J. Mol. Biol., 157, pp.
105-31 (1982), incorporated herein by reference). The BLAST programs are publicly available
from the National Center for Biotechnology Information (NCBI) and other sources (BLAST
Manual, Altschul, S., et al. NCBI NLM NIH Bethesda, MD 20894; Altschul, S., et al., J. Mol.
Biol. 215:403-410 (1990)).
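
By way of illustration only, the notion of "largest match" used by global alignment programs such as GAP can be sketched as follows. The sketch is a minimal Needleman-Wunsch alignment with a linear gap penalty, followed by a percent identity calculation over the aligned columns. The function names, the scoring values (match = 1, mismatch = -1, gap = -2) and the choice of alignment length as the denominator are assumptions made for this example and are not parameters of any program named above.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align two sequences (Needleman-Wunsch, linear gap penalty).

    The scoring values are illustrative assumptions, not GAP defaults.
    Returns the two aligned strings, with '-' marking gap positions.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if (i > 0 and j > 0 and a[i - 1] == b[j - 1]) else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Percent of aligned columns holding identical residues (gaps count as non-identical)."""
    x, y = global_align(a, b)
    if not x:
        return 0.0
    same = sum(1 for p, q in zip(x, y) if p == q and p != "-")
    return 100.0 * same / len(x)
```

Other conventions for the denominator (for example, the length of the shorter sequence) give different percentages, so any stated identity value should be read together with the program and parameters used to obtain it.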

4.7 CHIMERIC AND FUSION PROTEINS 

The invention also provides chimeric or fusion proteins. As used herein, a "chimeric
protein" or "fusion protein" comprises a polypeptide of the invention operatively linked to
another polypeptide. Within a fusion protein the polypeptide according to the invention can
correspond to all or a portion of a protein according to the invention. In one embodiment, a
fusion protein comprises at least one biologically active portion of a protein according to the
invention. In another embodiment, a fusion protein comprises at least two biologically active
portions of a protein according to the invention. Within the fusion protein, the term "operatively
linked" is intended to indicate that the polypeptide according to the invention and the other
polypeptide are fused in-frame to each other. The polypeptide can be fused to the N-terminus or
C-terminus.
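
As an illustration of what "fused in-frame" requires at the nucleotide level, the following minimal sketch joins two coding sequences so that the reading frame of the upstream partner continues into the downstream partner. The function names and the example sequences are hypothetical, and the sketch assumes that both inputs are already complete coding sequences written 5' to 3' with no linker between them.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq):
    """Split a sequence into consecutive triplets, ignoring any trailing partial codon."""
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def fuse_in_frame(upstream_cds, downstream_cds):
    """Join two coding sequences so the downstream partner stays in frame.

    Hypothetical helper: the terminal stop codon of the upstream partner,
    if present, is removed so that translation reads through the junction.
    """
    up, down = upstream_cds.upper(), downstream_cds.upper()
    if len(up) % 3 or len(down) % 3:
        raise ValueError("each coding sequence must be a whole number of codons")
    if up[-3:] in STOP_CODONS:
        up = up[:-3]  # drop the terminal stop so the frame continues
    if any(c in STOP_CODONS for c in codons(up)):
        raise ValueError("upstream partner contains an internal stop codon")
    return up + down
```

For example, fusing the hypothetical sequences "ATGGCCTAA" and "ATGAAATGA" in this way removes the upstream TAA and yields a single open reading frame ending at the downstream TGA.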

For example, in one embodiment a fusion protein comprises a polypeptide according to
the invention operably linked to the extracellular domain of a second protein.

In another embodiment, the fusion protein is a GST-fusion protein in which the polypeptide 
sequences of the invention are fused to the C-terminus of the GST (i.e., glutathione 
S-transferase) sequences. 

In another embodiment, the fusion protein is an immunoglobulin fusion protein in which
the polypeptide sequences according to the invention, comprising one or more domains, are fused
to sequences derived from a member of the immunoglobulin protein family. The
immunoglobulin fusion proteins of the invention can be incorporated into pharmaceutical
compositions and administered to a subject to inhibit an interaction between a ligand and a
protein of the invention on the surface of a cell, to thereby suppress signal transduction in vivo.
The immunoglobulin fusion proteins can be used to affect the bioavailability of a cognate ligand.
Inhibition of the ligand/protein interaction may be useful therapeutically both for the treatment of
proliferative and differentiative disorders, e.g., cancer, and for modulating (e.g., promoting or
inhibiting) cell survival. Moreover, the immunoglobulin fusion proteins of the invention can be
used as immunogens to produce antibodies in a subject, to purify ligands, and in screening assays
to identify molecules that inhibit the interaction of a polypeptide of the invention with a ligand.

A chimeric or fusion protein of the invention can be produced by standard recombinant
DNA techniques. For example, DNA fragments coding for the different polypeptide sequences
are ligated together in-frame in accordance with conventional techniques, e.g., by employing
blunt-ended or stagger-ended termini for ligation, restriction enzyme digestion to provide for
appropriate termini, filling-in of cohesive ends as appropriate, alkaline phosphatase treatment to
avoid undesirable joining, and enzymatic ligation. In another embodiment, the fusion gene can
be synthesized by conventional techniques including automated DNA synthesizers.
Alternatively, PCR amplification of gene fragments can be carried out using anchor primers that
give rise to complementary overhangs between two consecutive gene fragments that can
subsequently be annealed and reamplified to generate a chimeric gene sequence (see, for
example, Ausubel et al. (eds.) Current Protocols in Molecular Biology, John Wiley &
Sons, 1992). Moreover, many expression vectors are commercially available that already encode
a fusion moiety (e.g., a GST polypeptide). A nucleic acid encoding a polypeptide of the
invention can be cloned into such an expression vector such that the fusion moiety is linked
in-frame to the protein of the invention.

4.8 GENE THERAPY 

Mutations in the polynucleotides of the invention may result in loss of normal
function of the encoded protein. The invention thus provides gene therapy to restore normal
activity of the polypeptides of the invention, or to treat disease states involving polypeptides of
the invention. Delivery of a functional gene encoding polypeptides of the invention to
appropriate cells is effected ex vivo, in situ, or in vivo by use of vectors, and more particularly
viral vectors (e.g., adenovirus, adeno-associated virus, or a retrovirus), or ex vivo by use of
physical DNA transfer methods (e.g., liposomes or chemical treatments). See, for example,
Anderson, Nature, supplement to vol. 392, no. 6679, pp. 25-30 (1998). For additional reviews of
gene therapy technology see Friedmann, Science, 244: 1275-1281 (1989); Verma, Scientific
American: 68-84 (1990); and Miller, Nature, 357: 455-460 (1992). Introduction of any one of
the nucleotides of the present invention or a gene encoding the polypeptides of the present
invention can also be accomplished with extrachromosomal substrates (transient expression) or
artificial chromosomes (stable expression). Cells may also be cultured ex vivo in the presence of
proteins of the present invention in order to proliferate or to produce a desired effect on or
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes.
Alternatively, it is contemplated that in other human disease states, preventing the expression of
or inhibiting the activity of polypeptides of the invention will be useful in treating the disease
states. It is contemplated that antisense therapy or gene therapy could be applied to negatively
regulate the expression of polypeptides of the invention.

Other methods of inhibiting expression of a protein include the introduction of antisense
molecules to the nucleic acids of the present invention, their complements, or their translated RNA
sequences, by methods known in the art. Further, the polypeptides of the present invention can be
inhibited by using targeted deletion methods, or the insertion of a negative regulatory element such
as a silencer, which is tissue specific.

The present invention still further provides cells genetically engineered in vivo to express the 
polynucleotides of the invention, wherein such polynucleotides are in operative association with a 
regulatory sequence heterologous to the host cell which drives expression of the polynucleotides in 
the cell. These methods can be used to increase or decrease the expression of the polynucleotides of
the present invention. 

Knowledge of DNA sequences provided by the invention allows for modification of cells to
permit, increase, or decrease expression of endogenous polypeptide. Cells can be modified (e.g., by
homologous recombination) to provide increased polypeptide expression by replacing, in whole or
in part, the naturally occurring promoter with all or part of a heterologous promoter so that the cells
express the protein at higher levels. The heterologous promoter is inserted in such a manner that it is
operatively linked to the desired protein encoding sequences. See, for example, PCT International
Publication No. WO 94/12650, PCT International Publication No. WO 92/20808, and PCT
International Publication No. WO 91/09955. It is also contemplated that, in addition to heterologous
promoter DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the desired
protein coding sequence, amplification of the marker DNA by standard selection methods results in
co-amplification of the desired protein coding sequences in the cells.

In another embodiment of the present invention, cells and tissues may be engineered to
express an endogenous gene comprising the polynucleotides of the invention under the control of
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene may
be replaced by homologous recombination. As described herein, gene targeting can be used to
replace a gene's existing regulatory region with a regulatory sequence isolated from a different gene
or a novel regulatory sequence synthesized by genetic engineering methods. Such regulatory
sequences may be comprised of promoters, enhancers, scaffold-attachment regions, negative
regulatory elements, transcriptional initiation sites, regulatory protein binding sites or combinations
of said sequences. Alternatively, sequences which affect the structure or stability of the RNA or
protein produced may be replaced, removed, added, or otherwise modified by targeting. These
sequences include polyadenylation signals, mRNA stability elements, splice sites, leader sequences
for enhancing or modifying transport or secretion properties of the protein, or other sequences
which alter or improve the function or stability of protein or RNA molecules.

The targeting event may be a simple insertion of the regulatory sequence, placing the gene
under the control of the new regulatory sequence, e.g., inserting a new promoter or enhancer or both
upstream of a gene. Alternatively, the targeting event may be a simple deletion of a regulatory
element, such as the deletion of a tissue-specific negative regulatory element. Alternatively, the
targeting event may replace an existing element; for example, a tissue-specific enhancer can be
replaced by an enhancer that has broader or different cell-type specificity than the naturally
occurring elements. Here, the naturally occurring sequences are deleted and new sequences are
added. In all cases, the identification of the targeting event may be facilitated by the use of one or
more selectable marker genes that are contiguous with the targeting DNA, allowing for the selection
of cells in which the exogenous DNA has integrated into the cell genome. The identification of the
targeting event may also be facilitated by the use of one or more marker genes exhibiting the
property of negative selection, such that the negatively selectable marker is linked to the exogenous
DNA, but configured such that the negatively selectable marker flanks the targeting sequence, and
such that a correct homologous recombination event with sequences in the host cell genome does
not result in the stable integration of the negatively selectable marker. Markers useful for this
purpose include the Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial
xanthine-guanine phosphoribosyl-transferase (gpt) gene.

The gene targeting or gene activation techniques which can be used in accordance with this
aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to Chappel;
U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No. PCT/US92/09627
(WO93/09222) by Selden et al.; and International Application No. PCT/US90/06436
(WO91/06667) by Skoultchi et al., each of which is incorporated by reference herein in its entirety.

4.9 TRANSGENIC ANIMALS 

In preferred methods to determine biological functions of the polypeptides of the
invention in vivo, one or more genes provided by the invention are either overexpressed or
inactivated in the germ line of animals using homologous recombination [Capecchi, Science
244:1288-1292 (1989)]. Animals in which the gene is overexpressed, under the regulatory
control of exogenous or endogenous promoter elements, are known as transgenic animals.
Animals in which an endogenous gene has been inactivated by homologous recombination are
referred to as "knockout" animals. Knockout animals, preferably non-human mammals, can be
prepared as described in U.S. Patent No. 5,557,032, incorporated herein by reference. Transgenic
animals are useful to determine the roles polypeptides of the invention play in biological
processes, and preferably in disease states. Transgenic animals are useful as model systems to
identify compounds that modulate lipid metabolism. Transgenic animals, preferably non-human
mammals, are produced using methods as described in U.S. Patent No. 5,489,743 and PCT
Publication No. WO94/28122, incorporated herein by reference.

Transgenic animals can be prepared wherein all or part of a promoter of the
polynucleotides of the invention is either activated or inactivated to alter the level of expression
of the polypeptides of the invention. Inactivation can be carried out using homologous
recombination methods described above. Activation can be achieved by supplementing or even
replacing the homologous promoter to provide for increased protein expression. The homologous
promoter can be supplemented by insertion of one or more heterologous enhancer elements
known to confer promoter activation in a particular tissue.

The polynucleotides of the present invention also make possible the development,
through, e.g., homologous recombination or knockout strategies, of animals that fail to express
polypeptides of the invention or that express a variant polypeptide. Such animals are useful as
models for studying the in vivo activities of the polypeptide as well as for studying modulators of the
polypeptides of the invention.


4.10 USES AND BIOLOGICAL ACTIVITY 

The polynucleotides and proteins of the present invention are expected to exhibit one or
more of the uses or biological activities (including those associated with assays cited herein)
identified herein. Uses or activities described for proteins of the present invention may be
provided by administration or use of such proteins or of polynucleotides encoding such proteins
(such as, for example, in gene therapies or vectors suitable for introduction of DNA). The
mechanism underlying the particular condition or pathology will dictate whether the
polypeptides of the invention, the polynucleotides of the invention or modulators (activators or
inhibitors) thereof would be beneficial to the subject in need of treatment. Thus, "therapeutic
compositions of the invention" include compositions comprising isolated polynucleotides
(including recombinant DNA molecules, cloned genes and degenerate variants thereof) or
polypeptides of the invention (including full length protein, mature protein and truncations or
domains thereof), or compounds and other substances that modulate the overall activity of the
target gene products, either at the level of target gene/protein expression or target protein
activity. Such modulators include polypeptides, analogs (variants), including fragments and
fusion proteins, antibodies and other binding proteins; chemical compounds that directly or
indirectly activate or inhibit the polypeptides of the invention (identified, e.g., via drug screening
assays as described herein); antisense polynucleotides and polynucleotides suitable for triple
helix formation; and in particular antibodies or other binding partners that specifically recognize
one or more epitopes of the polypeptides of the invention.

The polypeptides of the present invention may likewise be involved in cellular activation
or in one of the other physiological pathways described herein.



4.10.1 RESEARCH USES AND UTILITIES 

The polynucleotides provided by the present invention can be used by the research
community for various purposes. The polynucleotides can be used to express recombinant
protein for analysis, characterization or therapeutic use; as markers for tissues in which the
corresponding protein is preferentially expressed (either constitutively or at a particular stage of
tissue differentiation or development or in disease states); as molecular weight markers on gels;
as chromosome markers or tags (when labeled) to identify chromosomes or to map related gene
positions; to compare with endogenous DNA sequences in patients to identify potential genetic
disorders; as probes to hybridize and thus discover novel, related DNA sequences; as a source of
information to derive PCR primers for genetic fingerprinting; as a probe to "subtract-out" known
sequences in the process of discovering other novel polynucleotides; for selecting and making
oligomers for attachment to a "gene chip" or other support, including for examination of
expression patterns; to raise anti-protein antibodies using DNA immunization techniques; and as
an antigen to raise anti-DNA antibodies or elicit another immune response. Where the
polynucleotide encodes a protein which binds or potentially binds to another protein (such as, for
example, in a receptor-ligand interaction), the polynucleotide can also be used in interaction trap
assays (such as, for example, that described in Gyuris et al., Cell 75:791-803 (1993)) to identify
polynucleotides encoding the other protein with which binding occurs or to identify inhibitors of
the binding interaction.
The polypeptides provided by the present invention can similarly be used in assays to
determine biological activity, including in a panel of multiple proteins for high-throughput
screening; to raise antibodies or to elicit another immune response; as a reagent (including the
labeled reagent) in assays designed to quantitatively determine levels of the protein (or its
receptor) in biological fluids; as markers for tissues in which the corresponding polypeptide is
preferentially expressed (either constitutively or at a particular stage of tissue differentiation or
development or in a disease state); and, of course, to isolate correlative receptors or ligands.
Proteins involved in these binding interactions can also be used to screen for peptide or small
molecule inhibitors or agonists of the binding interaction.

Any or all of these research utilities are capable of being developed into reagent grade or
kit format for commercialization as research products.

Methods for performing the uses listed above are well known to those skilled in the art.
References disclosing such methods include without limitation "Molecular Cloning: A
Laboratory Manual", 2d ed., Cold Spring Harbor Laboratory Press, Sambrook, J., E. F. Fritsch
and T. Maniatis eds., 1989, and "Methods in Enzymology: Guide to Molecular Cloning
Techniques", Academic Press, Berger, S. L. and A. R. Kimmel eds., 1987.

4.10.2 NUTRITIONAL USES 

Polynucleotides and polypeptides of the present invention can also be used as nutritional
sources or supplements. Such uses include without limitation use as a protein or amino acid
supplement, use as a carbon source, use as a nitrogen source and use as a source of carbohydrate. In
such cases the polypeptide or polynucleotide of the invention can be added to the feed of a
particular organism or can be administered as a separate solid or liquid preparation, such as in the
form of powder, pills, solutions, suspensions or capsules. In the case of microorganisms, the
polypeptide or polynucleotide of the invention can be added to the medium in or on which the
microorganism is cultured.

4.10.3 CYTOKINE AND CELL PROLIFERATION/DIFFERENTIATION 
ACTIVITY 

A polypeptide of the present invention may exhibit activity relating to cytokine, cell
proliferation (either inducing or inhibiting) or cell differentiation (either inducing or inhibiting)
activity or may induce production of other cytokines in certain cell populations. A
polynucleotide of the invention can encode a polypeptide exhibiting such attributes. Many
protein factors discovered to date, including all known cytokines, have exhibited activity in one
or more factor-dependent cell proliferation assays, and hence the assays serve as a convenient
confirmation of cytokine activity. The activity of therapeutic compositions of the present
invention is evidenced by any one of a number of routine factor-dependent cell proliferation
assays for cell lines including, without limitation, 32D, DA2, DA1G, T10, B9, B9/11, BaF3,
MC9/G, M+(preB M+), 2E8, RB5, DA1, 123, T1165, HT2, CTLL2, TF-1, Mo7e, CMK,
HUVEC, and Caco. Therapeutic compositions of the invention can be used in the following:

Assays for T-cell or thymocyte proliferation include without limitation those described 
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. 
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, 
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in 
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Bertagnolli et al., J. Immunol. 
145:1706-1712, 1990; Bertagnolli et al., Cellular Immunology 133:327-341, 1991; Bertagnolli
et al., J. Immunol. 149:3778-3783, 1992; Bowman et al., J. Immunol. 152:1756-1761, 1994.

Assays for cytokine production and/or proliferation of spleen cells, lymph node cells or 
thymocytes include, without limitation, those described in: Polyclonal T cell stimulation, 
Kruisbeek, A. M. and Shevach, E. M. In Current Protocols in Immunology. J. E. e.a. Coligan 
eds. Vol 1 pp. 3.12.1-3.12.14, John Wiley and Sons, Toronto. 1994; and Measurement of mouse 
and human interleukin-γ, Schreiber, R. D. In Current Protocols in Immunology. J. E. e.a. Coligan
eds. Vol 1 pp. 6.8.1-6.8.8, John Wiley and Sons, Toronto. 1994. 

Assays for proliferation and differentiation of hematopoietic and lymphopoietic cells
include, without limitation, those described in: Measurement of Human and Murine Interleukin 2
and Interleukin 4, Bottomly, K., Davis, L. S. and Lipsky, P. E. In Current Protocols in
Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.3.1-6.3.12, John Wiley and Sons, Toronto. 1991;
deVries et al., J. Exp. Med. 173:1205-1211, 1991; Moreau et al., Nature 336:690-692, 1988;
Greenberger et al., Proc. Natl. Acad. Sci. U.S.A. 80:2931-2938, 1983; Measurement of mouse
and human interleukin 6 - Nordan, R. In Current Protocols in Immunology. J. E. Coligan eds. Vol
1 pp. 6.6.1-6.6.5, John Wiley and Sons, Toronto. 1991; Smith et al., Proc. Natl. Acad. Sci.
U.S.A. 83:1857-1861, 1986; Measurement of human Interleukin 11 - Bennett, F., Giannotti, J.,
Clark, S. C. and Turner, K. J. In Current Protocols in Immunology. J. E. Coligan eds. Vol 1 pp.
6.15.1, John Wiley and Sons, Toronto. 1991; Measurement of mouse and human Interleukin
9 - Ciarletta, A., Giannotti, J., Clark, S. C. and Turner, K. J. In Current Protocols in Immunology.
J. E. Coligan eds. Vol 1 pp. 6.13.1, John Wiley and Sons, Toronto. 1991.

Assays for T-cell clone responses to antigens (which will identify, among others, proteins
that affect APC-T cell interactions as well as direct T-cell effects by measuring proliferation and
cytokine production) include, without limitation, those described in: Current Protocols in
Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober,
Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, In Vitro assays for Mouse
Lymphocyte Function; Chapter 6, Cytokines and their cellular receptors; Chapter 7,
Immunologic studies in Humans); Weinberger et al., Proc. Natl. Acad. Sci. USA 77:6091-6095,
1980; Weinberger et al., Eur. J. Immun. 11:405-411, 1981; Takai et al., J. Immunol.
137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 1988.



4.10.4 STEM CELL GROWTH FACTOR ACTIVITY 

A polypeptide of the present invention may exhibit stem cell growth factor activity and
be involved in the proliferation, differentiation and survival of pluripotent and totipotent stem
cells including primordial germ cells, embryonic stem cells, hematopoietic stem cells and/or
germ line stem cells. Administration of the polypeptide of the invention to stem cells in vivo or
ex vivo is expected to maintain and expand cell populations in a totipotential or pluripotential
state which would be useful for re-engineering damaged or diseased tissues, transplantation,
manufacture of bio-pharmaceuticals and the development of bio-sensors. The ability to produce
large quantities of human cells has important working applications for the production of human
proteins which currently must be obtained from non-human sources or donors; implantation of
cells to treat diseases such as Parkinson's, Alzheimer's and other neurodegenerative diseases;
tissues for grafting such as bone marrow, skin, cartilage, tendons, bone, muscle (including
cardiac muscle), blood vessels, cornea, neural cells, gastrointestinal cells and others; and organs
for transplantation such as kidney, liver, pancreas (including islet cells), heart and lung.

It is contemplated that multiple different exogenous growth factors and/or cytokines may
be administered in combination with the polypeptide of the invention to achieve the desired
effect, including any of the growth factors listed herein, other stem cell maintenance factors, and
specifically including stem cell factor (SCF), leukemia inhibitory factor (LIF), Flt-3 ligand
(Flt-3L), any of the interleukins, recombinant soluble IL-6 receptor fused to IL-6, macrophage
inflammatory protein 1-alpha (MIP-1-alpha), G-CSF, GM-CSF, thrombopoietin (TPO), platelet
factor 4 (PF-4), platelet-derived growth factor (PDGF), neural growth factors and basic fibroblast
growth factor (bFGF).

Since totipotent stem cells can give rise to virtually any mature cell type, expansion of
these cells in culture will facilitate the production of large quantities of mature cells. Techniques
for culturing stem cells are known in the art and administration of polypeptides of the invention,
optionally with other growth factors and/or cytokines, is expected to enhance the survival and
proliferation of the stem cell populations. This can be accomplished by direct administration of
the polypeptide of the invention to the culture medium. Alternatively, stromal cells transfected
with a polynucleotide that encodes the polypeptide of the invention can be used as a feeder
layer for the stem cell populations in culture or in vivo. Stromal support cells for feeder layers
may include embryonic bone marrow fibroblasts, bone marrow stromal cells, fetal liver cells, or
cultured embryonic fibroblasts (see U.S. Patent No. 5,690,926).

Stem cells themselves can be transfected with a polynucleotide of the invention to induce
autocrine expression of the polypeptide of the invention. This will allow for generation of
undifferentiated totipotential/pluripotential stem cell lines that are useful as is or that can then be
differentiated into the desired mature cell types. These stable cell lines can also serve as a source
of undifferentiated totipotential/pluripotential mRNA to create cDNA libraries and templates for
polymerase chain reaction experiments. These studies would allow for the isolation and
identification of differentially expressed genes in stem cell populations that regulate stem cell
proliferation and/or maintenance.

Expansion and maintenance of totipotent stem cell populations will be useful in the
treatment of many pathological conditions. For example, polypeptides of the present invention
may be used to manipulate stem cells in culture to give rise to neuroepithelial cells that can be
used to augment or replace cells damaged by illness, autoimmune disease, accidental damage or
genetic disorders. The polypeptide of the invention may be useful for inducing the proliferation
of neural cells and for the regeneration of nerve and brain tissue, i.e., for the treatment of central
and peripheral nervous system diseases and neuropathies, as well as mechanical and traumatic
disorders which involve degeneration, death or trauma to neural cells or nerve tissue. In addition,
the expanded stem cell populations can also be genetically altered for gene therapy purposes and
to decrease host rejection of replacement tissues after grafting or implantation.

Expression of the polypeptide of the invention and its effect on stem cells can also be
manipulated to achieve controlled differentiation of the stem cells into more differentiated cell
types. A broadly applicable method of obtaining pure populations of a specific differentiated
cell type from undifferentiated stem cell populations involves the use of a cell-type specific
promoter driving a selectable marker. The selectable marker allows only cells of the desired type
to survive. For example, stem cells can be induced to differentiate into cardiomyocytes (Wobus
et al., Differentiation, 48: 173-182, (1991); Klug et al., J. Clin. Invest., 98(1): 216-224, (1998))
or skeletal muscle cells (Browder, L. W. In: Principles of Tissue Engineering eds. Lanza et al.,
Academic Press (1997)). Alternatively, directed differentiation of stem cells can be
accomplished by culturing the stem cells in the presence of a differentiation factor such as
retinoic acid and an antagonist of the polypeptide of the invention which would inhibit the
effects of endogenous stem cell factor activity and allow differentiation to proceed.

In vitro cultures of stem cells can be used to determine if the polypeptide of the invention
exhibits stem cell growth factor activity. Stem cells are isolated from any one of various cell
sources (including hematopoietic stem cells and embryonic stem cells) and cultured on a feeder
layer, as described by Thompson et al. Proc. Natl. Acad. Sci. U.S.A., 92: 7844-7848 (1995), in
the presence of the polypeptide of the invention alone or in combination with other growth
factors or cytokines. The ability of the polypeptide of the invention to induce stem cell
proliferation is determined by colony formation on semi-solid support, e.g. as described by
Bernstein et al. Blood, 77: 2316-2321 (1991).



4.10.5 HEMATOPOIESIS REGULATING ACTIVITY 

A polypeptide of the present invention may be involved in regulation of hematopoiesis
and, consequently, in the treatment of myeloid or lymphoid cell disorders. Even marginal
biological activity in support of colony forming cells or of factor-dependent cell lines indicates
involvement in regulating hematopoiesis, e.g. in supporting the growth and proliferation of
erythroid progenitor cells alone or in combination with other cytokines, thereby indicating utility,
for example, in treating various anemias or for use in conjunction with irradiation/chemotherapy
to stimulate the production of erythroid precursors and/or erythroid cells; in supporting the
growth and proliferation of myeloid cells such as granulocytes and monocytes/macrophages (i.e.,
traditional CSF activity) useful, for example, in conjunction with chemotherapy to prevent or
treat consequent myelo-suppression; in supporting the growth and proliferation of
megakaryocytes and consequently of platelets, thereby allowing prevention or treatment of
various platelet disorders such as thrombocytopenia, and generally for use in place of or
complementary to platelet transfusions; and/or in supporting the growth and proliferation of
hematopoietic stem cells which are capable of maturing to any and all of the above-mentioned
hematopoietic cells and therefore find therapeutic utility in various stem cell disorders (such as
those usually treated with transplantation, including, without limitation, aplastic anemia and
paroxysmal nocturnal hemoglobinuria), as well as in repopulating the stem cell compartment
post irradiation/chemotherapy, either in vivo or ex vivo (i.e., in conjunction with bone marrow
transplantation or with peripheral progenitor cell transplantation (homologous or heterologous))
as normal cells or genetically manipulated for gene therapy.

Therapeutic compositions of the invention can be used in the following:

Suitable assays for proliferation and differentiation of various hematopoietic lines are
cited above.

Assays for embryonic stem cell differentiation (which will identify, among others,
proteins that influence embryonic differentiation hematopoiesis) include, without limitation,
those described in: Johansson et al. Cellular Biology 15:141-151, 1995; Keller et al., Molecular
and Cellular Biology 13:473-486, 1993; McClanahan et al., Blood 81:2903-2915, 1993.

Assays for stem cell survival and differentiation (which will identify, among others,
proteins that regulate lympho-hematopoiesis) include, without limitation, those described in:
Methylcellulose colony forming assays, Freshney, M. G. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 265-268, Wiley-Liss, Inc., New York, N.Y. 1994; Hirayama et al.,
Proc. Natl. Acad. Sci. USA 89:5907-5911, 1992; Primitive hematopoietic colony forming cells
with high proliferative potential, McNiece, I. K. and Briddell, R. A. In Culture of Hematopoietic
Cells. R. I. Freshney, et al. eds. Vol pp. 23-39, Wiley-Liss, Inc., New York, N.Y. 1994; Neben et
al., Experimental Hematology 22:353-359, 1994; Cobblestone area forming cell assay,
Ploemacher, R. E. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 1-21,
Wiley-Liss, Inc., New York, N.Y. 1994; Long term bone marrow cultures in the presence of
stromal cells, Spooncer, E., Dexter, M. and Allen, T. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 163-179, Wiley-Liss, Inc., New York, N.Y. 1994; Long term culture
initiating cell assay, Sutherland, H. J. In Culture of Hematopoietic Cells. R. I. Freshney, et al.
eds. Vol pp. 139-162, Wiley-Liss, Inc., New York, N.Y. 1994.


4.10.6 TISSUE GROWTH ACTIVITY 

A polypeptide of the present invention also may be involved in bone, cartilage, tendon, 
ligament and/or nerve tissue growth or regeneration, as well as in wound healing and tissue 
repair and replacement, and in healing of burns, incisions and ulcers. 

A polypeptide of the present invention which induces cartilage and/or bone growth in
circumstances where bone is not normally formed has application in the healing of bone
fractures and cartilage damage or defects in humans and other animals. Compositions of a
polypeptide, antibody, binding partner, or other modulator of the invention may have
prophylactic use in closed as well as open fracture reduction and also in the improved fixation of
artificial joints. De novo bone formation induced by an osteogenic agent contributes to the repair
of congenital, trauma induced, or oncologic resection induced craniofacial defects, and also is
useful in cosmetic plastic surgery.

A polypeptide of this invention may also be involved in attracting bone-forming cells,
stimulating growth of bone-forming cells, or inducing differentiation of progenitors of
bone-forming cells. Treatment of osteoporosis, osteoarthritis, bone degenerative disorders, or
periodontal disease, such as through stimulation of bone and/or cartilage repair or by blocking
inflammation or processes of tissue destruction (collagenase activity, osteoclast activity, etc.)
mediated by inflammatory processes may also be possible using the composition of the
invention.

Another category of tissue regeneration activity that may involve the polypeptide of the
present invention is tendon/ligament formation. Induction of tendon/ligament-like tissue or
other tissue formation in circumstances where such tissue is not normally formed has application
in the healing of tendon or ligament tears, deformities and other tendon or ligament defects in
humans and other animals. Such a preparation employing a tendon/ligament-like tissue inducing
protein may have prophylactic use in preventing damage to tendon or ligament tissue, as well as
use in the improved fixation of tendon or ligament to bone or other tissues, and in repairing
defects to tendon or ligament tissue. De novo tendon/ligament-like tissue formation induced by
a composition of the present invention contributes to the repair of congenital, trauma induced, or
other tendon or ligament defects of other origin, and is also useful in cosmetic plastic surgery for
attachment or repair of tendons or ligaments. The compositions of the present invention may
provide an environment to attract tendon- or ligament-forming cells, stimulate growth of tendon- or
ligament-forming cells, induce differentiation of progenitors of tendon- or ligament-forming
cells, or induce growth of tendon/ligament cells or progenitors ex vivo for return in vivo to effect
tissue repair. The compositions of the invention may also be useful in the treatment of tendinitis,
carpal tunnel syndrome and other tendon or ligament defects. The compositions may also include
an appropriate matrix and/or sequestering agent as a carrier as is well known in the art.

The compositions of the present invention may also be useful for proliferation of neural
cells and for regeneration of nerve and brain tissue, i.e. for the treatment of central and peripheral
nervous system diseases and neuropathies, as well as mechanical and traumatic disorders, which
involve degeneration, death or trauma to neural cells or nerve tissue. More specifically, a
composition may be used in the treatment of diseases of the peripheral nervous system, such as
peripheral nerve injuries, peripheral neuropathy and localized neuropathies, and central nervous
system diseases, such as Alzheimer's, Parkinson's disease, Huntington's disease, amyotrophic
lateral sclerosis, and Shy-Drager syndrome. Further conditions which may be treated in
accordance with the present invention include mechanical and traumatic disorders, such as spinal
cord disorders, head trauma and cerebrovascular diseases such as stroke. Peripheral neuropathies
resulting from chemotherapy or other medical therapies may also be treatable using a
composition of the invention.

Compositions of the invention may also be useful to promote better or faster closure of
non-healing wounds, including without limitation pressure ulcers, ulcers associated with vascular
insufficiency, surgical and traumatic wounds, and the like.

Compositions of the present invention may also be involved in the generation or
regeneration of other tissues, such as organs (including, for example, pancreas, liver, intestine,
kidney, skin, endothelium), muscle (smooth, skeletal or cardiac) and vascular (including vascular
endothelium) tissue, or for promoting the growth of cells comprising such tissues. Part of the
desired effects may be achieved by inhibition or modulation of fibrotic scarring, which may allow
normal tissue to regenerate. A polypeptide of the present invention may also exhibit angiogenic
activity.

A composition of the present invention may also be useful for gut protection or
regeneration and treatment of lung or liver fibrosis, reperfusion injury in various tissues, and
conditions resulting from systemic cytokine damage.

A composition of the present invention may also be useful for promoting or inhibiting
differentiation of tissues described above from precursor tissues or cells; or for inhibiting the
growth of tissues described above.

Therapeutic compositions of the invention can be used in the following:

Assays for tissue generation activity include, without limitation, those described in:
International Patent Publication No. WO95/16035 (bone, cartilage, tendon); International Patent
Publication No. WO95/05846 (nerve, neuronal); International Patent Publication No.
WO91/07491 (skin, endothelium).

Assays for wound healing activity include, without limitation, those described in: Winter,
Epidermal Wound Healing, pp. 71-112 (Maibach, H. I. and Rovee, D. T., eds.), Year Book
Medical Publishers, Inc., Chicago, as modified by Eaglstein and Mertz, J. Invest. Dermatol.
71:382-84 (1978).

4.10.7 IMMUNE STIMULATING OR SUPPRESSING ACTIVITY

A polypeptide of the present invention may also exhibit immune stimulating or immune
suppressing activity, including without limitation the activities for which assays are described
herein. A polynucleotide of the invention can encode a polypeptide exhibiting such activities. A
protein may be useful in the treatment of various immune deficiencies and disorders (including
severe combined immunodeficiency (SCID)), e.g., in regulating (up or down) growth and
proliferation of T and/or B lymphocytes, as well as affecting the cytolytic activity of NK cells
and other cell populations. These immune deficiencies may be genetic or be caused by viral (e.g.,
HIV) as well as bacterial or fungal infections, or may result from autoimmune disorders. More
specifically, infectious diseases caused by viral, bacterial, fungal or other infection may be
treatable using a protein of the present invention, including infections by HIV, hepatitis viruses,
herpes viruses, mycobacteria, Leishmania spp., malaria spp. and various fungal infections such
as candidiasis. Of course, in this regard, proteins of the present invention may also be useful
where a boost to the immune system generally may be desirable, i.e., in the treatment of cancer.

Autoimmune disorders which may be treated using a protein of the present invention
include, for example, connective tissue disease, multiple sclerosis, systemic lupus erythematosus,
rheumatoid arthritis, autoimmune pulmonary inflammation, Guillain-Barre syndrome,
autoimmune thyroiditis, insulin dependent diabetes mellitus, myasthenia gravis, graft-versus-host
disease and autoimmune inflammatory eye disease. Such a protein (or antagonists thereof,
including antibodies) of the present invention may also be useful in the treatment of allergic
reactions and conditions (e.g., anaphylaxis, serum sickness, drug reactions, food allergies, insect
venom allergies, mastocytosis, allergic rhinitis, hypersensitivity pneumonitis, urticaria,
angioedema, eczema, atopic dermatitis, allergic contact dermatitis, erythema multiforme,
Stevens-Johnson syndrome, allergic conjunctivitis, atopic keratoconjunctivitis, vernal
keratoconjunctivitis, giant papillary conjunctivitis and contact allergies), such as asthma
(particularly allergic asthma) or other respiratory problems. Other conditions, in which immune
suppression is desired (including, for example, organ transplantation), may also be treatable
using a protein (or antagonists thereof) of the present invention. The therapeutic effects of the
polypeptides or antagonists thereof on allergic reactions can be evaluated by in vivo animal
models such as the cumulative contact enhancement test (Lastbom et al., Toxicology 125: 59-66,
1998), skin prick test (Hoffmann et al., Allergy 54: 446-54, 1999), guinea pig skin sensitization
test (Vohr et al., Arch. Toxicol. 73: 501-9), and murine local lymph node assay (Kimber et al.,
J. Toxicol. Environ. Health 53: 563-79).
Using the proteins of the invention it may also be possible to modulate immune
responses in a number of ways. Down regulation may be in the form of inhibiting or blocking an
immune response already in progress or may involve preventing the induction of an immune
response. The functions of activated T cells may be inhibited by suppressing T cell responses or
by inducing specific tolerance in T cells, or both. Immunosuppression of T cell responses is
generally an active, non-antigen-specific process which requires continuous exposure of the T
cells to the suppressive agent. Tolerance, which involves inducing non-responsiveness or anergy
in T cells, is distinguishable from immunosuppression in that it is generally antigen-specific and
persists after exposure to the tolerizing agent has ceased. Operationally, tolerance can be
demonstrated by the lack of a T cell response upon reexposure to specific antigen in the absence
of the tolerizing agent.

Down regulating or preventing one or more antigen functions (including without
limitation B lymphocyte antigen functions (such as, for example, B7)), e.g., preventing high
level lymphokine synthesis by activated T cells, will be useful in situations of tissue, skin and
organ transplantation and in graft-versus-host disease (GVHD). For example, blockage of T cell
function should result in reduced tissue destruction in tissue transplantation. Typically, in tissue
transplants, rejection of the transplant is initiated through its recognition as foreign by T cells,
followed by an immune reaction that destroys the transplant. The administration of a therapeutic
composition of the invention may prevent cytokine synthesis by immune cells, such as T cells,
and thus act as an immunosuppressant. Moreover, a lack of costimulation may also be sufficient
to anergize the T cells, thereby inducing tolerance in a subject. Induction of long-term tolerance
by B lymphocyte antigen-blocking reagents may avoid the necessity of repeated administration
of these blocking reagents. To achieve sufficient immunosuppression or tolerance in a subject, it
may also be necessary to block the function of a combination of B lymphocyte antigens.

The efficacy of particular therapeutic compositions in preventing organ transplant
rejection or GVHD can be assessed using animal models that are predictive of efficacy in
humans. Examples of appropriate systems which can be used include allogeneic cardiac grafts in
rats and xenogeneic pancreatic islet cell grafts in mice, both of which have been used to examine
the immunosuppressive effects of CTLA4Ig fusion proteins in vivo as described in Lenschow et
al., Science 257:789-792 (1992) and Turka et al., Proc. Natl. Acad. Sci. USA, 89:11102-11105
(1992). In addition, murine models of GVHD (see Paul ed., Fundamental Immunology, Raven
Press, New York, 1989, pp. 846-847) can be used to determine the effect of therapeutic
compositions of the invention on the development of that disease.

Blocking antigen function may also be therapeutically useful for treating autoimmune
diseases. Many autoimmune disorders are the result of inappropriate activation of T cells that are
reactive against self tissue and which promote the production of cytokines and autoantibodies
involved in the pathology of the diseases. Preventing the activation of autoreactive T cells may
reduce or eliminate disease symptoms. Administration of reagents which block stimulation of T
cells can be used to inhibit T cell activation and prevent production of autoantibodies or T
cell-derived cytokines which may be involved in the disease process. Additionally, blocking
reagents may induce antigen-specific tolerance of autoreactive T cells which could lead to
long-term relief from the disease. The efficacy of blocking reagents in preventing or alleviating
autoimmune disorders can be determined using a number of well-characterized animal models of
human autoimmune diseases. Examples include murine experimental autoimmune encephalitis,
systemic lupus erythematosus in MRL/lpr/lpr mice or NZB hybrid mice, murine autoimmune
collagen arthritis, diabetes mellitus in NOD mice and BB rats, and murine experimental
myasthenia gravis (see Paul ed., Fundamental Immunology, Raven Press, New York, 1989, pp.
840-856).

Upregulation of an antigen function (e.g., a B lymphocyte antigen function), as a means
of up regulating immune responses, may also be useful in therapy. Upregulation of immune
responses may be in the form of enhancing an existing immune response or eliciting an initial
immune response. For example, enhancing an immune response may be useful in cases of viral
infection, including systemic viral diseases such as influenza, the common cold, and encephalitis.

Alternatively, anti-viral immune responses may be enhanced in an infected patient by
removing T cells from the patient, costimulating the T cells in vitro with viral antigen-pulsed
APCs either expressing a peptide of the present invention or together with a stimulatory form of
a soluble peptide of the present invention and reintroducing the in vitro activated T cells into the
patient. Another method of enhancing anti-viral immune responses would be to isolate infected
cells from a patient, transfect them with a nucleic acid encoding a protein of the present
invention as described herein such that the cells express all or a portion of the protein on their
surface, and reintroduce the transfected cells into the patient. The infected cells would now be
capable of delivering a costimulatory signal to, and thereby activate, T cells in vivo.

A polypeptide of the present invention may provide the necessary stimulation signal to T
cells to induce a T cell mediated immune response against the transfected tumor cells. In
addition, tumor cells which lack MHC class I or MHC class II molecules, or which fail to
reexpress sufficient amounts of MHC class I or MHC class II molecules, can be transfected with
nucleic acid encoding all or a portion (e.g., a cytoplasmic-domain truncated portion) of an
MHC class I alpha chain protein and β2 microglobulin protein or an MHC class II alpha chain
protein and an MHC class II beta chain protein to thereby express MHC class I or MHC class II
proteins on the cell surface. Expression of the appropriate class I or class II MHC in conjunction
with a peptide having the activity of a B lymphocyte antigen (e.g., B7-1, B7-2, B7-3) induces a T
cell mediated immune response against the transfected tumor cell. Optionally, a gene encoding
an antisense construct which blocks expression of an MHC class II associated protein, such as
the invariant chain, can also be cotransfected with a DNA encoding a peptide having the activity
of a B lymphocyte antigen to promote presentation of tumor associated antigens and induce
tumor specific immunity. Thus, the induction of a T cell mediated immune response in a human
subject may be sufficient to overcome tumor-specific tolerance in the subject.

The activity of a protein of the invention may, among other means, be measured by the
following methods:

Suitable assays for thymocyte or splenocyte cytotoxicity include, without limitation,
those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D.
H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19;
Chapter 7, Immunologic studies in Humans); Herrmann et al., Proc. Natl. Acad. Sci. USA
78:2488-2492, 1981; Herrmann et al., J. Immunol. 128:1968-1974, 1982; Handa et al., J.
Immunol. 135:1564-1572, 1985; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J.
Immunol. 140:508-512, 1988; Bowman et al., J. Virology 61:1992-1998; Bertagnolli et al.,
Cellular Immunology 133:327-341, 1991; Brown et al., J. Immunol. 153:3079-3092, 1994.

Assays for T-cell-dependent immunoglobulin responses and isotype switching (which
will identify, among others, proteins that modulate T-cell dependent antibody responses and that
affect Th1/Th2 profiles) include, without limitation, those described in: Maliszewski, J.
Immunol. 144:3028-3033, 1990; and Assays for B cell function: In vitro antibody production,
Mond, J. J. and Brunswick, M. In Current Protocols in Immunology. J. E. e.a. Coligan eds. Vol 1
pp. 3.8.1-3.8.16, John Wiley and Sons, Toronto. 1994.

Mixed lymphocyte reaction (MLR) assays (which will identify, among others, proteins
that generate predominantly Th1 and CTL responses) include, without limitation, those described
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E.
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3,
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512,
1988; Bertagnolli et al., J. Immunol. 149:3778-3783, 1992.

Dendritic cell-dependent assays (which will identify, among others, proteins expressed by
dendritic cells that activate naive T-cells) include, without limitation, those described in: Guery
et al., J. Immunol. 134:536-544, 1995; Inaba et al., Journal of Experimental Medicine
173:549-559, 1991; Macatonia et al., Journal of Immunology 154:5071-5079, 1995; Porgador et
al., Journal of Experimental Medicine 182:255-260, 1995; Nair et al., Journal of Virology
67:4062-4069, 1993; Huang et al., Science 264:961-965, 1994; Macatonia et al., Journal of
Experimental Medicine 169:1255-1264, 1989; Bhardwaj et al., Journal of Clinical Investigation
94:797-807, 1994; and Inaba et al., Journal of Experimental Medicine 172:631-640, 1990.

Assays for lymphocyte survival/apoptosis (which will identify, among others, proteins
that prevent apoptosis after superantigen induction and proteins that regulate lymphocyte
homeostasis) include, without limitation, those described in: Darzynkiewicz et al., Cytometry
13:795-808, 1992; Gorczyca et al., Leukemia 7:659-670, 1993; Gorczyca et al., Cancer Research
53:1945-1951, 1993; Itoh et al., Cell 66:233-243, 1991; Zacharchuk, Journal of Immunology
145:4037-4045, 1990; Zamai et al., Cytometry 14:891-897, 1993; Gorczyca et al., International
Journal of Oncology 1:639-648, 1992.

Assays for proteins that influence early steps of T-cell commitment and development
include, without limitation, those described in: Antica et al., Blood 84:111-117, 1994; Fine et al.,
Cellular Immunology 155:111-122, 1994; Galy et al., Blood 85:2770-2778, 1995; Toki et al.,
Proc. Natl. Acad. Sci. USA 88:7548-7551, 1991.

4.10.8 ACTIVIN/INHIBIN ACTIVITY

A polypeptide of the present invention may also exhibit activin- or inhibin-related
activities. A polynucleotide of the invention may encode a polypeptide exhibiting such
characteristics. Inhibins are characterized by their ability to inhibit the release of follicle
stimulating hormone (FSH), while activins are characterized by their ability to stimulate the
release of FSH. Thus, a polypeptide of the present invention,
alone or in heterodimers with a member of the inhibin family, may be useful as a contraceptive
based on the ability of inhibins to decrease fertility in female mammals and decrease
spermatogenesis in male mammals. Administration of sufficient amounts of other inhibins can
induce infertility in these mammals. Alternatively, the polypeptide of the invention, as a
homodimer or as a heterodimer with other protein subunits of the inhibin group, may be useful as
a fertility inducing therapeutic, based upon the ability of activin molecules to stimulate FSH
release from cells of the anterior pituitary. See, for example, U.S. Pat. No. 4,798,885. A
polypeptide of the invention may also be useful for advancement of the onset of fertility in
sexually immature mammals, so as to increase the lifetime reproductive performance of domestic
animals such as, but not limited to, cows, sheep and pigs.

The activity of a polypeptide of the invention may, among other means, be measured by 
the following methods. 

Assays for activin/inhibin activity include, without limitation, those described in: Vale et
al., Endocrinology 91:562-572, 1972; Ling et al., Nature 321:779-782, 1986; Vale et al., Nature
321:776-779, 1986; Mason et al., Nature 318:659-663, 1985; Forage et al., Proc. Natl. Acad. Sci.
USA 83:3091-3095, 1986.



4.10.9 CHEMOTACTIC/CHEMOKINETIC ACTIVITY 

A polypeptide of the present invention may be involved in chemotactic or chemokinetic
activity for mammalian cells, including, for example, monocytes, fibroblasts, neutrophils,
T-cells, mast cells, eosinophils, epithelial and/or endothelial cells. A polynucleotide of the
invention can encode a polypeptide exhibiting such attributes. Chemotactic and chemokinetic
receptor activation can be used to mobilize or attract a desired cell population to a desired site of
action. Chemotactic or chemokinetic compositions (e.g. proteins, antibodies, binding partners, or
modulators of the invention) provide particular advantages in treatment of wounds and other
trauma to tissues, as well as in treatment of localized infections. For example, attraction of
lymphocytes, monocytes or neutrophils to tumors or sites of infection may result in improved
immune responses against the tumor or infecting agent.

A protein or peptide has chemotactic activity for a particular cell population if it can
stimulate, directly or indirectly, the directed orientation or movement of such cell population.
Preferably, the protein or peptide has the ability to directly stimulate directed movement of cells.
Whether a particular protein has chemotactic activity for a population of cells can be readily
determined by employing such protein or peptide in any known assay for cell chemotaxis.

Therapeutic compositions of the invention can be used in the following:

Assays for chemotactic activity (which will identify proteins that induce or prevent
chemotaxis) consist of assays that measure the ability of a protein to induce the migration of cells
across a membrane as well as the ability of a protein to induce the adhesion of one cell
population to another cell population. Suitable assays for movement and adhesion include,
without limitation, those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A.
M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates
and Wiley-Interscience (Chapter 6.12, Measurement of alpha and beta Chemokines
6.12.1-6.12.28); Taub et al. J. Clin. Invest. 95:1370-1376, 1995; Lind et al. APMIS 103:140-146,
1995; Muller et al. Eur. J. Immunol. 25:1744-1748; Gruber et al. J. of Immunol. 152:5860-5867,
1994; Johnston et al. J. of Immunol. 153:1762-1768, 1994.

4.10.10 HEMOSTATIC AND THROMBOLYTIC ACTIVITY 

A polypeptide of the invention may also be involved in hemostasis or thrombolysis or
thrombosis. A polynucleotide of the invention can encode a polypeptide exhibiting such
attributes. Compositions may be useful in treatment of various coagulation disorders (including
hereditary disorders, such as hemophilias) or to enhance coagulation and other hemostatic events
in treating wounds resulting from trauma, surgery or other causes. A composition of the
invention may also be useful for dissolving or inhibiting formation of thromboses and for
treatment and prevention of conditions resulting therefrom (such as, for example, infarction of
cardiac and central nervous system vessels (e.g., stroke)).

Therapeutic compositions of the invention can be used in the following: 
Assays for hemostatic and thrombolytic activity include, without limitation, those 
described in: Linet et al., J. Clin. Pharmacol. 26:131-140, 1986; Burdick et al., Thrombosis Res. 
45:413-419, 1987; Humphrey et al., Fibrinolysis 5:71-79 (1991); Schaub, Prostaglandins 
35:467-474, 1988. 

4.10.11 CANCER DIAGNOSIS AND THERAPY 

Polypeptides of the invention may be involved in cancer cell generation, proliferation or 
metastasis. Detection of the presence or amount of polynucleotides or polypeptides of the 
invention may be useful for the diagnosis and/or prognosis of one or more types of cancer. For 
example, the presence or increased expression of a polynucleotide/polypeptide of the invention 
may indicate a hereditary risk of cancer, a precancerous condition, or an ongoing malignancy. 
Conversely, a defect in the gene or absence of the polypeptide may be associated with a cancer 
condition. Identification of single nucleotide polymorphisms associated with cancer or a 
predisposition to cancer may also be useful for diagnosis or prognosis. 

Cancer treatments promote tumor regression by inhibiting tumor cell proliferation, 
inhibiting angiogenesis (growth of new blood vessels that is necessary to support tumor growth) 
and/or prohibiting metastasis by reducing tumor cell motility or invasiveness. Therapeutic 
compositions of the invention may be effective in adult and pediatric oncology including in solid 
phase tumors/malignancies, locally advanced tumors, human soft tissue sarcomas, metastatic 
cancer, including lymphatic metastases, blood cell malignancies including multiple myeloma, 
acute and chronic leukemias, and lymphomas, head and neck cancers including mouth cancer, 
larynx cancer and thyroid cancer, lung cancers including small cell carcinoma and non-small cell 
cancers, breast cancers including small cell carcinoma and ductal carcinoma, gastrointestinal 
cancers including esophageal cancer, stomach cancer, colon cancer, colorectal cancer and polyps 
associated with colorectal neoplasia, pancreatic cancers, liver cancer, urologic cancers including 
bladder cancer and prostate cancer, malignancies of the female genital tract including ovarian 
carcinoma, uterine (including endometrial) cancers, and solid tumor in the ovarian follicle, 
kidney cancers including renal cell carcinoma, brain cancers including intrinsic brain tumors, 
neuroblastoma, astrocytic brain tumors, gliomas, metastatic tumor cell invasion in the central 
nervous system, bone cancers including osteomas, skin cancers including malignant melanoma, 
tumor progression of human skin keratinocytes, squamous cell carcinoma, basal cell carcinoma, 
hemangiopericytoma and Kaposi's sarcoma. 

Polypeptides, polynucleotides, or modulators of polypeptides of the invention (including 
inhibitors and stimulators of the biological activity of the polypeptide of the invention) may be 
administered to treat cancer. Therapeutic compositions can be administered in therapeutically 
effective dosages alone or in combination with adjuvant cancer therapy such as surgery, 
chemotherapy, radiotherapy, thermotherapy, and laser therapy, and may provide a beneficial 
effect, e.g. reducing tumor size, slowing rate of tumor growth, inhibiting metastasis, or otherwise 
improving overall clinical condition, without necessarily eradicating the cancer. 

The composition can also be administered in therapeutically effective amounts as a 
portion of an anti-cancer cocktail. An anti-cancer cocktail is a mixture of the polypeptide or 
modulator of the invention with one or more anti-cancer drugs in addition to a pharmaceutically 
acceptable carrier for delivery. The use of anti-cancer cocktails as a cancer treatment is routine. 
Anti-cancer drugs that are well known in the art and can be used as a treatment in combination 
with the polypeptide or modulator of the invention include: Actinomycin D, Aminoglutethimide, 
Asparaginase, Bleomycin, Busulfan, Carboplatin, Carmustine, Chlorambucil, Cisplatin (cis-DDP), 
Cyclophosphamide, Cytarabine HCl (Cytosine arabinoside), Dacarbazine, Dactinomycin, 
Daunorubicin HCl, Doxorubicin HCl, Estramustine phosphate sodium, Etoposide (V16-213), 
Floxuridine, 5-Fluorouracil (5-Fu), Flutamide, Hydroxyurea (hydroxycarbamide), Ifosfamide, 
Interferon Alpha-2a, Interferon Alpha-2b, Leuprolide acetate (LHRH-releasing factor analog), 
Lomustine, Mechlorethamine HCl (nitrogen mustard), Melphalan, Mercaptopurine, Mesna, 
Methotrexate (MTX), Mitomycin, Mitoxantrone HCl, Octreotide, Plicamycin, Procarbazine HCl, 
Streptozocin, Tamoxifen citrate, Thioguanine, Thiotepa, Vinblastine sulfate, Vincristine sulfate, 
Amsacrine, Azacitidine, Hexamethylmelamine, Interleukin-2, Mitoguazone, Pentostatin, 
Semustine, Teniposide, and Vindesine sulfate. 

In addition, therapeutic compositions of the invention may be used for prophylactic 
treatment of cancer. There are hereditary conditions and/or environmental situations (e.g. 
exposure to carcinogens) known in the art that predispose an individual to developing cancers. 
Under these circumstances, it may be beneficial to treat these individuals with therapeutically 
effective doses of the polypeptide of the invention to reduce the risk of developing cancers. 

In vitro models can be used to determine the effective doses of the polypeptide of the 
invention as a potential cancer treatment. These in vitro models include proliferation assays of 
cultured tumor cells, growth of cultured tumor cells in soft agar (see Freshney, (1987) Culture of 
Animal Cells: A Manual of Basic Technique, Wiley-Liss, New York, NY Ch 18 and Ch 21), 
tumor systems in nude mice as described in Giovanella et al., J. Natl. Can. Inst., 52: 921-30 
(1974), mobility and invasive potential of tumor cells in Boyden Chamber assays as described in 
Pilkington et al., Anticancer Res., 17: 4107-9 (1997), and angiogenesis assays such as induction 
of vascularization of the chick chorioallantoic membrane or induction of vascular endothelial 
cell migration as described in Ribatta et al., Intl. J. Dev. Biol., 40: 1189-97 (1999) and Li et al., 
Clin. Exp. Metastasis, 17:423-9 (1999), respectively. Suitable tumor cell lines are available, 
e.g. from American Type Tissue Culture Collection catalogs. 

4.10.12 RECEPTOR/LIGAND ACTIVITY 

A polypeptide of the present invention may also demonstrate activity as a receptor, 
receptor ligand or inhibitor or agonist of receptor/ligand interactions. A polynucleotide of the 
invention can encode a polypeptide exhibiting such characteristics. Examples of such receptors 
and ligands include, without limitation, cytokine receptors and their ligands, receptor kinases and 
their ligands, receptor phosphatases and their ligands, receptors involved in cell-cell interactions 
and their ligands (including without limitation, cellular adhesion molecules (such as selectins, 
integrins and their ligands)) and receptor/ligand pairs involved in antigen presentation, antigen 
recognition and development of cellular and humoral immune responses. Receptors and ligands 
are also useful for screening of potential peptide or small molecule inhibitors of the relevant 
receptor/ligand interaction. A protein of the present invention (including, without limitation, 
fragments of receptors and ligands) may itself be useful as an inhibitor of receptor/ligand 
interactions. 

The activity of a polypeptide of the invention may, among other means, be measured by 
the following methods: 

Suitable assays for receptor-ligand activity include without limitation those described in: 
Current Protocols in Immunology, Ed. by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. 
Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 7.28, 
Measurement of Cellular Adhesion under static conditions 7.28.1-7.28.22), Takai et al., Proc. 
Natl. Acad. Sci. USA 84:6864-6868, 1987; Bierer et al., J. Exp. Med. 168:1145-1156, 1988; 
Rosenstein et al., J. Exp. Med. 169:149-160, 1989; Stoltenborg et al., J. Immunol. Methods 
175:59-68, 1994; Stitt et al., Cell 80:661-670, 1995. 

By way of example, the polypeptides of the invention may be used as a receptor for a 
ligand(s) thereby transmitting the biological activity of that ligand(s). Ligands may be identified 
through binding assays, affinity chromatography, dihybrid screening assays, BIAcore assays, gel 
overlay assays, or other methods known in the art. 

Studies characterizing drugs or proteins as agonists, antagonists, partial agonists or 
partial antagonists require the use of other proteins as competing ligands. The polypeptides of the 
present invention or ligand(s) thereof may be labeled by being coupled to radioisotopes, 
colorimetric molecules or toxin molecules by conventional methods ("Guide to Protein 
Purification" Murray P. Deutscher (ed) Methods in Enzymology Vol. 182 (1990) Academic 
Press, Inc. San Diego). Examples of radioisotopes include, but are not limited to, tritium and 
carbon-14. Examples of colorimetric molecules include, but are not limited to, fluorescent 
molecules such as fluorescamine, or rhodamine or other colorimetric molecules. Examples of 
toxins include, but are not limited to, ricin. 

4.10.13 DRUG SCREENING 

This invention is particularly useful for screening chemical compounds by using the 
novel polypeptides or binding fragments thereof in any of a variety of drug screening techniques. 
The polypeptides or fragments employed in such a test may either be free in solution, affixed to a 
solid support, borne on a cell surface or located intracellularly. One method of drug screening 
utilizes eukaryotic or prokaryotic host cells which are stably transformed with recombinant 
nucleic acids expressing the polypeptide or a fragment thereof. Drugs are screened against such 
transformed cells in competitive binding assays. Such cells, either in viable or fixed form, can 
be used for standard binding assays. One may measure, for example, the formation of complexes 
between polypeptides of the invention or fragments and the agent being tested or examine the 
diminution in complex formation between the novel polypeptides and an appropriate cell line, 
which are well known in the art. 
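
The "diminution in complex formation" measured in a competitive binding assay can be expressed as a percentage of the signal seen without the test agent. The following is a minimal sketch under stated assumptions, not a method taken from this specification: the function name and the idea of a single background-corrected signal per well are hypothetical, chosen only to illustrate the arithmetic.

```python
def percent_inhibition(signal_without_agent: float, signal_with_agent: float) -> float:
    """Percent reduction in complex formation caused by a test agent.

    Both arguments are assumed (hypothetically) to be background-corrected
    readouts of complex formation, e.g. bound label, in the same units.
    """
    if signal_without_agent <= 0:
        raise ValueError("control signal must be positive")
    # 0% means the agent had no effect; 100% means complex formation was abolished.
    return 100.0 * (1.0 - signal_with_agent / signal_without_agent)
```

A negative result would indicate that the agent increased complex formation relative to the control rather than diminishing it.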

Sources for test compounds that may be screened for ability to bind to or modulate (i.e., 
increase or decrease) the activity of polypeptides of the invention include (1) inorganic and 
organic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries 
comprised of either random or mimetic peptides, oligonucleotides or organic molecules. 

Chemical libraries may be readily synthesized or purchased from a number of 
commercial sources, and may include structural analogs of known compounds or compounds 
that are identified as "hits" or "leads" via natural product screening. 

The sources of natural product libraries are microorganisms (including bacteria and 
fungi), animals, plants or other vegetation, or marine organisms, and libraries of mixtures for 
screening may be created by: (1) fermentation and extraction of broths from soil, plant or marine 
microorganisms or (2) extraction of the organisms themselves. Natural product libraries include 
polyketides, non-ribosomal peptides, and (non-naturally occurring) variants thereof. For a 
review, see Science 282:63-68 (1998). 

Combinatorial libraries are composed of large numbers of peptides, oligonucleotides or 
organic compounds and can be readily prepared by traditional automated synthesis methods, 
PCR, cloning or proprietary synthetic methods. Of particular interest are peptide and 
oligonucleotide combinatorial libraries. Still other libraries of interest include peptide, protein, 
peptidomimetic, multiparallel synthetic collection, recombinatorial, and polypeptide libraries. 
For a review of combinatorial chemistry and libraries created therefrom, see Myers, Curr. Opin. 
Biotechnol 8:701-707 (1997). For reviews and examples of peptidomimetic libraries, see 
Al-Obeidi et al., Mol Biotechnol 9(3):205-23 (1998); Hruby et al., Curr Opin Chem Biol, 
1(1):114-19 (1997); Dorner et al., Bioorg Med Chem, 4(5):709-15 (1996) (alkylated dipeptides). 

Identification of modulators through use of the various libraries described herein permits 
modification of the candidate "hit" (or "lead") to optimize the capacity of the "hit" to bind a 
polypeptide of the invention. The molecules identified in the binding assay are then tested for 
antagonist or agonist activity in in vivo tissue culture or animal models that are well known in the 
art. In brief, the molecules are titrated into a plurality of cell cultures or animals and then tested 
for either cell/animal death or prolonged survival of the animal/cells. 

The binding molecules thus identified may be complexed with toxins, e.g., ricin or 
cholera, or with other compounds that are toxic to cells such as radioisotopes. The toxin-binding 
molecule complex is then targeted to a tumor or other cell by the specificity of the binding 
molecule for a polypeptide of the invention. Alternatively, the binding molecules may be 
complexed with imaging agents for targeting and imaging purposes. 

4.10.14 ASSAY FOR RECEPTOR ACTIVITY 

The invention also provides methods to detect specific binding of a polypeptide, e.g., a 
ligand or a receptor. The art provides numerous assays particularly useful for identifying 
previously unknown binding partners for receptor polypeptides of the invention. For example, 
expression cloning using mammalian or bacterial cells, or dihybrid screening assays can be used 
to identify polynucleotides encoding binding partners. As another example, affinity 
chromatography with the appropriate immobilized polypeptide of the invention can be used to 
isolate polypeptides that recognize and bind polypeptides of the invention. There are a number 
of different libraries used for the identification of compounds, and in particular small molecules, 
that modulate (i.e., increase or decrease) biological activity of a polypeptide of the invention. 
Ligands for receptor polypeptides of the invention can also be identified by adding exogenous 
ligands, or cocktails of ligands, to two cell populations that are genetically identical except for 
the expression of the receptor of the invention: one cell population expresses the receptor of the 
invention whereas the other does not. The responses of the two cell populations to the addition of 
ligand(s) are then compared. Alternatively, an expression library can be co-expressed with the 
polypeptide of the invention in cells and assayed for an autocrine response to identify potential 
ligand(s). As still another example, BIAcore assays, gel overlay assays, or other methods known 
in the art can be used to identify binding partner polypeptides, including, (1) organic and 
inorganic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries 
comprised of random peptides, oligonucleotides or organic molecules. 

The role of downstream intracellular signaling molecules in the signaling cascade of the 
polypeptide of the invention can be determined. For example, a chimeric protein in which the 
cytoplasmic domain of the polypeptide of the invention is fused to the extracellular portion of a 
protein, whose ligand has been identified, is produced in a host cell. The cell is then incubated 
with the ligand specific for the extracellular portion of the chimeric protein, thereby activating 
the chimeric receptor. Known downstream proteins involved in intracellular signaling can then 
be assayed for expected modifications, e.g., phosphorylation. Other methods known to those in the 
art can also be used to identify signaling molecules involved in receptor activity. 

4.10.15 ANTI-INFLAMMATORY ACTIVITY 
Compositions of the present invention may also exhibit anti-inflammatory activity. The 
anti-inflammatory activity may be achieved by providing a stimulus to cells involved in the 
inflammatory response, by inhibiting or promoting cell-cell interactions (such as, for example, 
cell adhesion), by inhibiting or promoting chemotaxis of cells involved in the inflammatory 
process, inhibiting or promoting cell extravasation, or by stimulating or suppressing production 
of other factors which more directly inhibit or promote an inflammatory response. Compositions 
with such activities can be used to treat inflammatory conditions (including chronic or acute 
conditions), including without limitation inflammation associated with infection (such as septic 
shock, sepsis or systemic inflammatory response syndrome (SIRS)), ischemia-reperfusion injury, 
endotoxin lethality, arthritis, complement-mediated hyperacute rejection, nephritis, cytokine or 
chemokine-induced lung injury, inflammatory bowel disease, Crohn's disease, or inflammation 
resulting from overproduction of cytokines such as TNF or IL-1. Compositions of the invention 
may also be useful to treat anaphylaxis and hypersensitivity to an antigenic substance or material. 
Compositions of this invention may be utilized to prevent or treat conditions such as, but not 
limited to, sepsis, acute pancreatitis, endotoxin shock, cytokine induced shock, rheumatoid 
arthritis, chronic inflammatory arthritis, pancreatic cell damage from diabetes mellitus type 1, 
graft versus host disease, inflammatory bowel disease, inflammation associated with pulmonary 
disease, other autoimmune disease or inflammatory disease, or may be utilized as an 
antiproliferative agent, such as for acute or chronic myelogenous leukemia, or in the prevention 
of premature labor secondary to intrauterine infections. 



4.10.16 LEUKEMIAS 

Leukemias and related disorders may be treated or prevented by administration of a 
therapeutic that promotes or inhibits function of the polynucleotides and/or polypeptides of the 
invention. Such leukemias and related disorders include but are not limited to acute leukemia, 
acute lymphocytic leukemia, acute myelocytic leukemia, myeloblastic, promyelocytic, 
myelomonocytic, monocytic, erythroleukemia, chronic leukemia, chronic myelocytic 
(granulocytic) leukemia and chronic lymphocytic leukemia (for a review of such disorders, see 
Fishman et al., 1985, Medicine, 2d Ed., J.B. Lippincott Co., Philadelphia). 

4.10.17 NERVOUS SYSTEM DISORDERS 

Nervous system disorders, involving cell types which can be tested for efficacy of 
intervention with compounds that modulate the activity of the polynucleotides and/or 
polypeptides of the invention, and which can be treated upon thus observing an indication of 
therapeutic utility, include but are not limited to nervous system injuries, and diseases or 
disorders which result in either a disconnection of axons, a diminution or degeneration of 
neurons, or demyelination. Nervous system lesions which may be treated in a patient (including 
human and non-human mammalian patients) according to the invention include but are not 
limited to the following lesions of either the central (including spinal cord, brain) or peripheral 
nervous systems: 

(i) traumatic lesions, including lesions caused by physical injury or associated with 
surgery, for example, lesions which sever a portion of the nervous system, or compression 
injuries; 

(ii) ischemic lesions, in which a lack of oxygen in a portion of the nervous system 
results in neuronal injury or death, including cerebral infarction or ischemia, or spinal cord 
infarction or ischemia; 

(iii) infectious lesions, in which a portion of the nervous system is destroyed or injured 
as a result of infection, for example, by an abscess or associated with infection by human 
immunodeficiency virus, herpes zoster, or herpes simplex virus or with Lyme disease, 
tuberculosis, or syphilis; 

(iv) degenerative lesions, in which a portion of the nervous system is destroyed or 
injured as a result of a degenerative process including but not limited to degeneration associated 
with Parkinson's disease, Alzheimer's disease, Huntington's chorea, or amyotrophic lateral 
sclerosis; 

(v) lesions associated with nutritional diseases or disorders, in which a portion of the 
nervous system is destroyed or injured by a nutritional disorder or disorder of metabolism 
including but not limited to, vitamin B12 deficiency, folic acid deficiency, Wernicke disease, 
tobacco-alcohol amblyopia, Marchiafava-Bignami disease (primary degeneration of the corpus 
callosum), and alcoholic cerebellar degeneration; 

(vi) neurological lesions associated with systemic diseases including but not limited to 
diabetes (diabetic neuropathy, Bell's palsy), systemic lupus erythematosus, carcinoma, or 
sarcoidosis; 

(vii) lesions caused by toxic substances including alcohol, lead, or particular 
neurotoxins; and 

(viii) demyelinated lesions in which a portion of the nervous system is destroyed or 
injured by a demyelinating disease including but not limited to multiple sclerosis, human 
immunodeficiency virus-associated myelopathy, transverse myelopathy of various etiologies, 
progressive multifocal leukoencephalopathy, and central pontine myelinolysis. 

Therapeutics which are useful according to the invention for treatment of a nervous 
system disorder may be selected by testing for biological activity in promoting the survival or 
differentiation of neurons. For example, and not by way of limitation, therapeutics which elicit 
any of the following effects may be useful according to the invention: 

(i) increased survival time of neurons in culture; 

(ii) increased sprouting of neurons in culture or in vivo; 

(iii) increased production of a neuron-associated molecule in culture or in vivo, e.g., 
choline acetyltransferase or acetylcholinesterase with respect to motor neurons; or 

(iv) decreased symptoms of neuron dysfunction in vivo. 

Such effects may be measured by any method known in the art. In preferred, 
non-limiting embodiments, increased survival of neurons may be measured by the method set 
forth in Arakawa et al. (1990, J. Neurosci. 10:3507-3515); increased sprouting of neurons may 
be detected by methods set forth in Pestronk et al. (1980, Exp. Neurol. 70:65-82) or Brown et al. 
(1981, Ann. Rev. Neurosci. 4:17-42); increased production of neuron-associated molecules may 
be measured by bioassay, enzymatic assay, antibody binding, Northern blot assay, etc., 
depending on the molecule to be measured; and motor neuron dysfunction may be measured by 
assessing the physical manifestation of motor neuron disorder, e.g., weakness, motor neuron 
conduction velocity, or functional disability. 

In specific embodiments, motor neuron disorders that may be treated according to the 
invention include but are not limited to disorders such as infarction, infection, exposure to toxin, 
trauma, surgical damage, degenerative disease or malignancy that may affect motor neurons as 
well as other components of the nervous system, as well as disorders that selectively affect 
neurons such as amyotrophic lateral sclerosis, and including but not limited to progressive spinal 
muscular atrophy, progressive bulbar palsy, primary lateral sclerosis, infantile and juvenile 
muscular atrophy, progressive bulbar paralysis of childhood (Fazio-Londe syndrome), 
poliomyelitis and the post polio syndrome, and Hereditary Motorsensory Neuropathy 
(Charcot-Marie-Tooth Disease). 



4.10.18 OTHER ACTIVITIES 

A polypeptide of the invention may also exhibit one or more of the following additional 
activities or effects: inhibiting the growth, infection or function of, or killing, infectious agents, 
including, without limitation, bacteria, viruses, fungi and other parasites; affecting (suppressing 
or enhancing) bodily characteristics, including, without limitation, height, weight, hair color, eye 
color, skin, fat to lean ratio or other tissue pigmentation, or organ or body part size or shape 
(such as, for example, breast augmentation or diminution, change in bone form or shape); 
affecting biorhythms or circadian cycles or rhythms; affecting the fertility of male or female 
subjects; affecting the metabolism, catabolism, anabolism, processing, utilization, storage or 
elimination of dietary fat, lipid, protein, carbohydrate, vitamins, minerals, co-factors or other 
nutritional factors or component(s); affecting behavioral characteristics, including, without 
limitation, appetite, libido, stress, cognition (including cognitive disorders), depression 
(including depressive disorders) and violent behaviors; providing analgesic effects or other pain 
reducing effects; promoting differentiation and growth of embryonic stem cells in lineages other 
than hematopoietic lineages; hormonal or endocrine activity; in the case of enzymes, correcting 
deficiencies of the enzyme and treating deficiency-related diseases; treatment of 
hyperproliferative disorders (such as, for example, psoriasis); immunoglobulin-like activity (such 
as, for example, the ability to bind antigens or complement); and the ability to act as an antigen 
in a vaccine composition to raise an immune response against such protein or another material or 
entity which is cross-reactive with such protein. 



4.10.19 IDENTIFICATION OF POLYMORPHISMS 

The demonstration of polymorphisms makes possible the identification of such 
polymorphisms in human subjects and the pharmacogenetic use of this information for diagnosis 
and treatment. Such polymorphisms may be associated with, e.g., differential predisposition or 
susceptibility to various disease states (such as disorders involving inflammation or immune 
response) or a differential response to drug administration, and this genetic information can be 
used to tailor preventive or therapeutic treatment appropriately. For example, the existence of a 
polymorphism associated with a predisposition to inflammation or autoimmune disease makes 
possible the diagnosis of this condition in humans by identifying the presence of the 
polymorphism. 

Polymorphisms can be identified in a variety of ways known in the art which all 
generally involve obtaining a sample from a patient, analyzing DNA from the sample, optionally 
involving isolation or amplification of the DNA, and identifying the presence of the 
polymorphism in the DNA. For example, PCR may be used to amplify an appropriate fragment 
of genomic DNA which may then be sequenced. Alternatively, the DNA may be subjected to 
allele-specific oligonucleotide hybridization (in which appropriate oligonucleotides are 
hybridized to the DNA under conditions permitting detection of a single base mismatch) or to a 
single nucleotide extension assay (in which an oligonucleotide that hybridizes immediately 
adjacent to the position of the polymorphism is extended with one or more labeled nucleotides). 
In addition, traditional restriction fragment length polymorphism analysis (using restriction 
enzymes that provide differential digestion of the genomic DNA depending on the presence or 
absence of the polymorphism) may be performed. Arrays with nucleotide sequences of the 
present invention can be used to detect polymorphisms. The array can comprise modified 
nucleotide sequences of the present invention in order to detect the nucleotide sequences of the 
present invention. In the alternative, any one of the nucleotide sequences of the present 
invention can be placed on the array to detect changes from those sequences. 
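
The restriction fragment length polymorphism analysis mentioned above rests on a simple idea: a polymorphism that creates or destroys a recognition site changes the fragment lengths obtained on digestion. The following is a minimal sketch of that idea, assuming a linear fragment and examining one strand only. The two allele sequences are hypothetical and are not sequences of the invention; the recognition site GAATTC, cut after the first base, is that of EcoRI.

```python
def digest(sequence: str, site: str, cut_offset: int) -> list[int]:
    """Return fragment lengths after cutting a linear sequence at every site.

    cut_offset is the number of bases into the recognition site at which
    the enzyme cuts (1 for EcoRI, which cuts G^AATTC).
    """
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    # Fragment boundaries are the sequence ends plus every cut position.
    bounds = [0] + cuts + [len(sequence)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Hypothetical alleles differing by a single base inside the EcoRI site.
HYPOTHETICAL_ALLELE_A = "ACGTACGTACGT" + "GAATTC" + "TTGCATTGCA"  # site present
HYPOTHETICAL_ALLELE_B = "ACGTACGTACGT" + "GAGTTC" + "TTGCATTGCA"  # site destroyed

fragments_a = digest(HYPOTHETICAL_ALLELE_A, "GAATTC", 1)
fragments_b = digest(HYPOTHETICAL_ALLELE_B, "GAATTC", 1)
```

Under these assumptions the first allele yields two fragments and the second a single uncut fragment, which is the difference a gel would resolve.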

Alternatively, a polymorphism resulting in a change in the amino acid sequence could 
also be detected by detecting a corresponding change in amino acid sequence of the protein, e.g., 
by an antibody specific to the variant sequence. 

4.10.20 ARTHRITIS AND INFLAMMATION 

The immunosuppressive effects of the compositions of the invention against rheumatoid 
arthritis are determined in an experimental animal model system. The experimental model system 
is adjuvant induced arthritis in rats, and the protocol is described by J. Holoshitz, et al., 1983, 
Science, 219:56, or by B. Waksman et al., 1963, Int. Arch. Allergy Appl. Immunol., 23:129. 
Induction of the disease can be caused by a single injection, generally intradermally, of a 
suspension of killed Mycobacterium tuberculosis in complete Freund's adjuvant (CFA). The 
route of injection can vary, but rats may be injected at the base of the tail with an adjuvant 
mixture. The polypeptide is administered in phosphate buffered solution (PBS) at a dose of about 
1-5 mg/kg. The control consists of administering PBS only. 

The procedure for testing the effects of the test compound would consist of intradermally 
injecting killed Mycobacterium tuberculosis in CFA followed by immediately administering the 
test compound and subsequent treatment every other day until day 24. At 14, 15, 18, 20, 22, and 
24 days after injection of Mycobacterium CFA, an overall arthritis score may be obtained as 
described by J. Holoshitz above. An analysis of the data would reveal that the test compound 
would have a dramatic effect on the swelling of the joints as measured by a decrease of the 
arthritis score. 
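
The dosing and scoring timetable described above can be laid out explicitly. The following is a minimal sketch, assuming that the day of adjuvant injection is counted as day 0 and that "every other day until day 24" includes day 24; the function names are hypothetical, and the scores themselves must come from the experiment.

```python
from statistics import mean

# Days after adjuvant injection on which an arthritis score is taken (from the text).
SCORING_DAYS = [14, 15, 18, 20, 22, 24]


def treatment_days(last_day: int = 24, interval: int = 2) -> list[int]:
    """Days on which the test compound is given, starting on day 0 (assumed)."""
    return list(range(0, last_day + 1, interval))


def mean_score_reduction(control_scores: list[float], treated_scores: list[float]) -> float:
    """Mean control score minus mean treated score for one scoring day.

    A positive value means the treated group scored lower (less swelling)
    than the PBS-only control group.
    """
    return mean(control_scores) - mean(treated_scores)
```

Calling mean_score_reduction once per entry in SCORING_DAYS gives the time course of any difference between the treated and PBS-only groups.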


4.11 THERAPEUTIC METHODS 

The compositions (including polypeptide fragments, analogs, variants and antibodies or 
other binding partners or modulators including antisense polynucleotides) of the invention have 
numerous applications in a variety of therapeutic methods. Examples of therapeutic applications 
include, but are not limited to, those exemplified herein. 

4.11.1 EXAMPLE 

One embodiment of the invention is the administration of an effective amount of the 
polypeptides or other composition of the invention to individuals affected by a disease or 
disorder that can be modulated by regulating the peptides of the invention. While the mode of 
administration is not particularly important, parenteral administration is preferred. An 
exemplary mode of administration is to deliver an intravenous bolus. The dosage of the 
polypeptides or other composition of the invention will normally be determined by the 
prescribing physician. It is to be expected that the dosage will vary according to the age, weight, 
condition and response of the individual patient. Typically, the amount of polypeptide 
administered per dose will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight, with 
the preferred dose being about 0.1 µg/kg to 10 mg/kg of patient body weight. For parenteral 
administration, polypeptides of the invention will be formulated in an injectable form combined 
with a pharmaceutically acceptable parenteral vehicle. Such vehicles are well known in the art 
and examples include water, saline, Ringer's solution, dextrose solution, and solutions consisting 
of small amounts of human serum albumin. The vehicle may contain minor amounts of 
additives that maintain the isotonicity and stability of the polypeptide or other active ingredient. 
The preparation of such solutions is within the skill of the art. 
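
Because the per-dose range above is stated per kilogram of body weight and mixes two units, converting it to an absolute amount for a given patient takes one unit conversion. The following is a minimal sketch of that arithmetic, reading the lower bounds as micrograms per kilogram and the upper bounds as milligrams per kilogram; the function name and defaults are hypothetical, and the actual dosage remains a matter for the prescribing physician.

```python
UG_PER_MG = 1000.0  # micrograms per milligram


def dose_range_mg(weight_kg: float,
                  low_ug_per_kg: float = 0.01,
                  high_mg_per_kg: float = 100.0) -> tuple[float, float]:
    """Return (lowest, highest) per-dose amount in mg for a given body weight.

    Defaults follow the broad range in the text; pass 0.1 and 10.0 for the
    preferred range.
    """
    if weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low_mg = low_ug_per_kg * weight_kg / UG_PER_MG  # convert micrograms to mg
    high_mg = high_mg_per_kg * weight_kg
    return low_mg, high_mg
```

For example, dose_range_mg(70.0, 0.1, 10.0) returns the preferred range for a 70 kg patient, expressed in milligrams.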

4.12 PHARMACEUTICAL FORMULATIONS AND ROUTES OF 
ADMINISTRATION 

A protein or other composition of the present invention (from whatever source derived, 
including without limitation from recombinant and non-recombinant sources and including 
antibodies and other binding partners of the polypeptides of the invention) may be administered
to a patient in need, by itself, or in pharmaceutical compositions where it is mixed with suitable
carriers or excipient(s) at doses to treat or ameliorate a variety of disorders. Such a composition 
may optionally contain (in addition to protein or other active ingredient and a carrier) diluents, 
fillers, salts, buffers, stabilizers, solubilizers, and other materials well known in the art. The term
"pharmaceutically acceptable" means a non-toxic material that does not interfere with the
effectiveness of the biological activity of the active ingredient(s). The characteristics of the
carrier will depend on the route of administration. The pharmaceutical composition of the 
invention may also contain cytokines, lymphokines, or other hematopoietic factors such as 
M-CSF, GM-CSF, TNF, IL-1, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12, 
IL-13, IL-14, IL-15, IFN, TNF0, TNF1, TNF2, G-CSF, Meg-CSF, thrombopoietin, stem cell
factor, and erythropoietin. In further compositions, proteins of the invention may be combined
with other agents beneficial to the treatment of the disease or disorder in question. These agents 
include various growth factors such as epidermal growth factor (EGF), platelet-derived growth 
factor (PDGF), transforming growth factors (TGF-α and TGF-β), insulin-like growth factor
(IGF), as well as cytokines described herein.
The pharmaceutical composition may further contain other agents which either enhance
the activity of the protein or other active ingredient or complement its activity or use in
treatment. Such additional factors and/or agents may be included in the pharmaceutical 
composition to produce a synergistic effect with protein or other active ingredient of the 
invention, or to minimize side effects. Conversely, protein or other active ingredient of the
present invention may be included in formulations of the particular clotting factor, cytokine, 
lymphokine, other hematopoietic factor, thrombolytic or anti-thrombotic factor, or
anti-inflammatory agent to minimize side effects of the clotting factor, cytokine, lymphokine, other
hematopoietic factor, thrombolytic or anti-thrombotic factor, or anti-inflammatory agent (such as
IL-1Ra, IL-1 Hy1, IL-1 Hy2, anti-TNF, corticosteroids, immunosuppressive agents). A protein
of the present invention may be active in multimers (e.g., heterodimers or homodimers) or 
complexes with itself or other proteins. As a result, pharmaceutical compositions of the 
invention may comprise a protein of the invention in such multimeric or complexed form.
As an alternative to being included in a pharmaceutical composition of the invention
including a first protein, a second protein or a therapeutic agent may be concurrently
administered with the first protein (e.g., at the same time, or at differing times provided that
therapeutic concentrations of the combination of agents are achieved at the treatment site).
Techniques for formulation and administration of the compounds of the instant application may 
be found in "Remington's Pharmaceutical Sciences," Mack Publishing Co., Easton, PA, latest
edition. A therapeutically effective dose further refers to that amount of the compound sufficient
to result in amelioration of symptoms, e.g., treatment, healing, prevention or amelioration of the 
relevant medical condition, or an increase in rate of treatment, healing, prevention or 
amelioration of such conditions. When applied to an individual active ingredient, administered 
alone, a therapeutically effective dose refers to that ingredient alone. When applied to a
combination, a therapeutically effective dose refers to combined amounts of the active
ingredients that result in the therapeutic effect, whether administered in combination, serially or
simultaneously. 

In practicing the method of treatment or use of the present invention, a therapeutically 
effective amount of protein or other active ingredient of the present invention is administered to
a mammal having a condition to be treated. Protein or other active ingredient of the present
invention may be administered in accordance with the method of the invention either alone or in
combination with other therapies such as treatments employing cytokines, lymphokines or other 
hematopoietic factors. When co-administered with one or more cytokines, lymphokines or other
hematopoietic factors, protein or other active ingredient of the present invention may be
administered either simultaneously with the cytokine(s), lymphokine(s), other hematopoietic
factor(s), thrombolytic or anti-thrombotic factors, or sequentially. If administered sequentially,
the attending physician will decide on the appropriate sequence of administering protein or other
active ingredient of the present invention in combination with cytokine(s), lymphokine(s), other
hematopoietic factor(s), thrombolytic or anti-thrombotic factors.

4.12.1 ROUTES OF ADMINISTRATION 

Suitable routes of administration may, for example, include oral, rectal, transmucosal, or 
intestinal administration; parenteral delivery, including intramuscular, subcutaneous, 
intramedullary injections, as well as intrathecal, direct intraventricular, intravenous,
intraperitoneal, intranasal, or intraocular injections. Administration of protein or other active
ingredient of the present invention used in the pharmaceutical composition or to practice the 
method of the present invention can be carried out in a variety of conventional ways, such as oral 
ingestion, inhalation, topical application or cutaneous, subcutaneous, intraperitoneal, parenteral 
or intravenous injection. Intravenous administration to the patient is preferred. 

Alternatively, one may administer the compound in a local rather than systemic manner, for
example, via injection of the compound directly into arthritic joints or fibrotic tissue, often in
a depot or sustained release formulation. In order to prevent the scarring process frequently
occurring as a complication of glaucoma surgery, the compounds may be administered topically,
for example, as eye drops. Furthermore, one may administer the drug in a targeted drug delivery
system, for example, in a liposome coated with a specific antibody, targeting, for example,
arthritic or fibrotic tissue. The liposomes will be targeted to and taken up selectively by the 
afflicted tissue. 

The polypeptides of the invention are administered by any route that delivers an effective 
dosage to the desired site of action. The determination of a suitable route of administration and 
an effective dosage for a particular indication is within the level of skill in the art. Preferably for
wound treatment, one administers the therapeutic compound directly to the site. Suitable dosage
ranges for the polypeptides of the invention can be extrapolated from these dosages or from
similar studies in appropriate animal models. Dosages can then be adjusted as necessary by the
clinician to provide maximal therapeutic benefit.


4.12.2 COMPOSITIONS/FORMULATIONS 

Pharmaceutical compositions for use in accordance with the present invention thus may 
be formulated in a conventional manner using one or more physiologically acceptable carriers 
comprising excipients and auxiliaries which facilitate processing of the active compounds into 
preparations which can be used pharmaceutically. These pharmaceutical compositions may be
manufactured in a manner that is itself known, e.g., by means of conventional mixing,
dissolving, granulating, dragee-making, levigating, emulsifying, encapsulating, entrapping or
lyophilizing processes. Proper formulation is dependent upon the route of administration chosen. 
When a therapeutically effective amount of protein or other active ingredient of the present 
invention is administered orally, protein or other active ingredient of the present invention will 
be in the form of a tablet, capsule, powder, solution or elixir. When administered in tablet form, 
the pharmaceutical composition of the invention may additionally contain a solid carrier such as 
a gelatin or an adjuvant. The tablet, capsule, and powder contain from about 5 to 95% protein or 
other active ingredient of the present invention, and preferably from about 25 to 90% protein or 
other active ingredient of the present invention. When administered in liquid form, a liquid 
carrier such as water, petroleum, oils of animal or plant origin such as peanut oil, mineral oil, 
soybean oil, or sesame oil, or synthetic oils may be added. The liquid form of the 
pharmaceutical composition may further contain physiological saline solution, dextrose or other 
saccharide solution, or glycols such as ethylene glycol, propylene glycol or polyethylene glycol. 
When administered in liquid form, the pharmaceutical composition contains from about 0.5 to 
90% by weight of protein or other active ingredient of the present invention, and preferably from 
about 1 to 50% protein or other active ingredient of the present invention. 

When a therapeutically effective amount of protein or other active ingredient of the 
present invention is administered by intravenous, cutaneous or subcutaneous injection, protein or 
other active ingredient of the present invention will be in the form of a pyrogen-free, parenterally 
acceptable aqueous solution. The preparation of such parenterally acceptable protein or other 
active ingredient solutions, having due regard to pH, isotonicity, stability, and the like, is within 
the skill in the art. A preferred pharmaceutical composition for intravenous, cutaneous, or 
subcutaneous injection should contain, in addition to protein or other active ingredient of the 
present invention, an isotonic vehicle such as Sodium Chloride Injection, Ringer's Injection, 
Dextrose Injection, Dextrose and Sodium Chloride Injection, Lactated Ringer's Injection, or 
other vehicle as known in the art. The pharmaceutical composition of the present invention may 
also contain stabilizers, preservatives, buffers, antioxidants, or other additives known to those of 
skill in the art. For injection, the agents of the invention may be formulated in aqueous solutions, 
preferably in physiologically compatible buffers such as Hanks's solution, Ringer's solution, or 
physiological saline buffer. For transmucosal administration, penetrants appropriate to the 
barrier to be permeated are used in the formulation. Such penetrants are generally known in the 
art. 

For oral administration, the compounds can be formulated readily by combining the 
active compounds with pharmaceutically acceptable carriers well known in the art. Such carriers
enable the compounds of the invention to be formulated as tablets, pills, dragees, capsules,
liquids, gels, syrups, slurries, suspensions and the like, for oral ingestion by a patient to be 
treated. Pharmaceutical preparations for oral use can be obtained by combining the active
compounds with a solid excipient, optionally grinding the resulting mixture, and processing the
mixture of granules, after adding suitable auxiliaries, if desired, to obtain tablets or dragee
cores. Suitable excipients are, in
particular, fillers such as sugars, including lactose, sucrose, mannitol, or sorbitol; cellulose 
preparations such as, for example, maize starch, wheat starch, rice starch, potato starch, gelatin, 
gum tragacanth, methyl cellulose, hydroxypropylmethyl-cellulose, sodium 
carboxymethylcellulose, and/or polyvinylpyrrolidone (PVP). If desired, disintegrating agents
may be added, such as cross-linked polyvinyl pyrrolidone, agar, or alginic acid or a salt
thereof such as sodium alginate. Dragee cores are provided with suitable coatings. For this 
purpose, concentrated sugar solutions may be used, which may optionally contain gum arabic, 
talc, polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, and/or titanium dioxide, lacquer 
solutions, and suitable organic solvents or solvent mixtures. Dyestuffs or pigments may be
added to the tablets or dragee coatings for identification or to characterize different combinations
of active compound doses. 

Pharmaceutical preparations which can be used orally include push-fit capsules made of 
gelatin, as well as soft, sealed capsules made of gelatin and a plasticizer, such as glycerol or 
sorbitol. The push-fit capsules can contain the active ingredients in admixture with filler such as
lactose, binders such as starches, and/or lubricants such as talc or magnesium stearate and,
optionally, stabilizers. In soft capsules, the active compounds may be dissolved or suspended in
suitable liquids, such as fatty oils, liquid paraffin, or liquid polyethylene glycols. In addition, 
stabilizers may be added. All formulations for oral administration should be in dosages suitable 
for such administration. For buccal administration, the compositions may take the form of
tablets or lozenges formulated in conventional manner.

For administration by inhalation, the compounds for use according to the present 
invention are conveniently delivered in the form of an aerosol spray presentation from 
pressurized packs or a nebuliser, with the use of a suitable propellant, e.g., 
dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide or
other suitable gas. In the case of a pressurized aerosol the dosage unit may be determined by
providing a valve to deliver a metered amount. Capsules and cartridges of, e.g., gelatin for use in
an inhaler or insufflator may be formulated containing a powder mix of the compound and a 
suitable powder base such as lactose or starch. The compounds may be formulated for parenteral 
administration by injection, e.g., by bolus injection or continuous infusion. Formulations for
injection may be presented in unit dosage form, e.g., in ampules or in multi-dose containers, with
an added preservative. The compositions may take such forms as suspensions, solutions or
emulsions in oily or aqueous vehicles, and may contain formulatory agents such as suspending, 
stabilizing and/or dispersing agents. 

Pharmaceutical formulations for parenteral administration include aqueous solutions of 
the active compounds in water-soluble form. Additionally, suspensions of the active compounds
may be prepared as appropriate oily injection suspensions. Suitable lipophilic solvents or 
vehicles include fatty oils such as sesame oil, or synthetic fatty acid esters, such as ethyl oleate or 
triglycerides, or liposomes. Aqueous injection suspensions may contain substances which 
increase the viscosity of the suspension, such as sodium carboxymethyl cellulose, sorbitol, or
dextran. Optionally, the suspension may also contain suitable stabilizers or agents which
increase the solubility of the compounds to allow for the preparation of highly concentrated
solutions. Alternatively, the active ingredient may be in powder form for constitution with a
suitable vehicle, e.g., sterile pyrogen-free water, before use.

The compounds may also be formulated in rectal compositions such as suppositories or
retention enemas, e.g., containing conventional suppository bases such as cocoa butter or other
glycerides. In addition to the formulations described previously, the compounds may also be 
formulated as a depot preparation. Such long acting formulations may be administered by 
implantation (for example subcutaneously or intramuscularly) or by intramuscular injection. 
Thus, for example, the compounds may be formulated with suitable polymeric or hydrophobic
materials (for example as an emulsion in an acceptable oil) or ion exchange resins, or as
sparingly soluble derivatives, for example, as a sparingly soluble salt. 

A pharmaceutical carrier for the hydrophobic compounds of the invention is a co-solvent 
system comprising benzyl alcohol, a nonpolar surfactant, a water-miscible organic polymer, and 
an aqueous phase. The co-solvent system may be the VPD co-solvent system. VPD is a solution
of 3% w/v benzyl alcohol, 8% w/v of the nonpolar surfactant polysorbate 80, and 65% w/v
polyethylene glycol 300, made up to volume in absolute ethanol. The VPD co-solvent system 
(VPD:5W) consists of VPD diluted 1:1 with a 5% dextrose in water solution. This co-solvent 
system dissolves hydrophobic compounds well, and itself produces low toxicity upon systemic 
administration. Naturally, the proportions of a co-solvent system may be varied considerably
without destroying its solubility and toxicity characteristics. Furthermore, the identity of the
co-solvent components may be varied: for example, other low-toxicity nonpolar surfactants may 
be used instead of polysorbate 80; the fraction size of polyethylene glycol may be varied; other 
biocompatible polymers may replace polyethylene glycol, e.g. polyvinyl pyrrolidone; and other 
sugars or polysaccharides may substitute for dextrose. Alternatively, other delivery systems for
hydrophobic pharmaceutical compounds may be employed. Liposomes and emulsions are well
known examples of delivery vehicles or carriers for hydrophobic drugs. Certain organic solvents
such as dimethylsulfoxide also may be employed, although usually at the cost of greater toxicity. 
Additionally, the compounds may be delivered using a sustained-release system, such as 
semipermeable matrices of solid hydrophobic polymers containing the therapeutic agent. 
Various types of sustained-release materials have been established and are well known by those
skilled in the art. Sustained-release capsules may, depending on their chemical nature, release the 
compounds for a few weeks up to over 100 days. Depending on the chemical nature and the 
biological stability of the therapeutic reagent, additional strategies for protein or other active 
ingredient stabilization may be employed. 
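
The VPD:5W composition described above follows from simple dilution arithmetic. The sketch below computes the nominal concentrations after a 1:1 dilution of VPD with 5% dextrose in water. It assumes that volumes are additive on mixing, which is an approximation for ethanol and water mixtures; the dictionary and function names are hypothetical.

```python
# VPD stock as stated in the text (% w/v); the balance is absolute ethanol.
VPD = {
    "benzyl alcohol": 3.0,
    "polysorbate 80": 8.0,
    "polyethylene glycol 300": 65.0,
}
# Diluent: 5% dextrose in water.
DEXTROSE_IN_WATER = {"dextrose": 5.0}


def dilute(stock, diluent, stock_parts=1.0, diluent_parts=1.0):
    """Mix stock and diluent by volume, assuming additive volumes (an approximation)."""
    total = stock_parts + diluent_parts
    mixed = {name: pct * stock_parts / total for name, pct in stock.items()}
    for name, pct in diluent.items():
        mixed[name] = mixed.get(name, 0.0) + pct * diluent_parts / total
    return mixed


vpd_5w = dilute(VPD, DEXTROSE_IN_WATER)
for component, pct in vpd_5w.items():
    print(f"{component}: {pct:.1f}% w/v")
```

Under the additive-volume assumption, VPD:5W contains 1.5% benzyl alcohol, 4% polysorbate 80, 32.5% polyethylene glycol 300 and 2.5% dextrose (all w/v).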

The pharmaceutical compositions also may comprise suitable solid or gel phase carriers
or excipients. Examples of such carriers or excipients include but are not limited to calcium
carbonate, calcium phosphate, various sugars, starches, cellulose derivatives, gelatin, and 
polymers such as polyethylene glycols. Many of the active ingredients of the invention may be 
provided as salts with pharmaceutically compatible counter ions. Such pharmaceutically
acceptable base addition salts are those salts which retain the biological effectiveness and
properties of the free acids and which are obtained by reaction with inorganic or organic bases
such as sodium hydroxide, magnesium hydroxide, ammonia, trialkylamine, dialkylamine, 
monoalkylamine, dibasic amino acids, sodium acetate, potassium benzoate, triethanol amine and 
the like. 

The pharmaceutical composition of the invention may be in the form of a complex of the
protein(s) or other active ingredient(s) of the present invention along with protein or peptide
antigens. The protein and/or peptide antigen will deliver a stimulatory signal to both B and T 
lymphocytes. B lymphocytes will respond to antigen through their surface immunoglobulin 
receptor. T lymphocytes will respond to antigen through the T cell receptor (TCR) following
presentation of the antigen by MHC proteins. MHC and structurally related proteins including
those encoded by class I and class II MHC genes on host cells will serve to present the peptide 
antigen(s) to T lymphocytes. The antigen components could also be supplied as purified 
MHC-peptide complexes alone or with co-stimulatory molecules that can directly signal T cells. 
Alternatively, antibodies able to bind surface immunoglobulin and other molecules on B cells as
well as antibodies able to bind the TCR and other molecules on T cells can be combined with the
pharmaceutical composition of the invention. 

The pharmaceutical composition of the invention may be in the form of a liposome in
which protein of the present invention is combined, in addition to other pharmaceutically 
acceptable carriers, with amphipathic agents such as lipids which exist in aggregated form as
micelles, insoluble monolayers, liquid crystals, or lamellar layers in aqueous solution. Suitable
lipids for liposomal formulation include, without limitation, monoglycerides, diglycerides,
sulfatides, lysolecithins, phospholipids, saponin, bile acids, and the like. Preparation of such 
liposomal formulations is within the level of skill in the art, as disclosed, for example, in U.S. 
Patent Nos. 4,235,871; 4,501,728; 4,837,028; and 4,737,323, all of which are incorporated 
herein by reference. 

The amount of protein or other active ingredient of the present invention in the 
pharmaceutical composition of the present invention will depend upon the nature and severity of 
the condition being treated, and on the nature of prior treatments which the patient has 
undergone. Ultimately, the attending physician will decide the amount of protein or other active 
ingredient of the present invention with which to treat each individual patient. Initially, the 
attending physician will administer low doses of protein or other active ingredient of the present 
invention and observe the patient's response. Larger doses of protein or other active ingredient 
of the present invention may be administered until the optimal therapeutic effect is obtained for 
the patient, and at that point the dosage is not increased further. It is contemplated that the 
various pharmaceutical compositions used to practice the method of the present invention should 
contain about 0.01 µg to about 100 mg (preferably about 0.1 µg to about 10 mg, more preferably
about 0.1 µg to about 1 mg) of protein or other active ingredient of the present invention per kg
body weight. For compositions of the present invention which are useful for bone, cartilage, 
tendon or ligament regeneration, the therapeutic method includes administering the composition 
topically, systemically, or locally as an implant or device. When administered, the therapeutic
composition for use in this invention is, of course, in a pyrogen-free, physiologically acceptable
form. Further, the composition may desirably be encapsulated or injected in a viscous form for 
delivery to the site of bone, cartilage or tissue damage. Topical administration may be suitable 
for wound healing and tissue repair. Therapeutically useful agents other than a protein or other 
active ingredient of the invention which may also optionally be included in the composition as 
described above, may alternatively or additionally, be administered simultaneously or 
sequentially with the composition in the methods of the invention. Preferably for bone and/or 
cartilage formation, the composition would include a matrix capable of delivering the 
protein-containing or other active ingredient-containing composition to the site of bone and/or 
cartilage damage, providing a structure for the developing bone and cartilage and optimally 
capable of being resorbed into the body. Such matrices may be formed of materials presently in 
use for other implanted medical applications. 

The choice of matrix material is based on biocompatibility, biodegradability, mechanical 
properties, cosmetic appearance and interface properties. The particular application of the 
compositions will define the appropriate formulation. Potential matrices for the compositions
may be biodegradable and chemically defined calcium sulfate, tricalcium phosphate,
hydroxyapatite, polylactic acid, polyglycolic acid and polyanhydrides. Other potential materials
are biodegradable and biologically well-defined, such as bone or dermal collagen. Further 
matrices are comprised of pure proteins or extracellular matrix components. Other potential 
matrices are nonbiodegradable and chemically defined, such as sintered hydroxyapatite, bioglass,
aluminates, or other ceramics. Matrices may be comprised of combinations of any of the above 
mentioned types of material, such as polylactic acid and hydroxyapatite or collagen and 
tricalcium phosphate. The bioceramics may be altered in composition, such as in 
calcium-aluminate-phosphate and processing to alter pore size, particle size, particle shape, and 
biodegradability. Presently preferred is a 50:50 (mole weight) copolymer of lactic acid and
glycolic acid in the form of porous particles having diameters ranging from 150 to 800 microns.
In some applications, it will be useful to utilize a sequestering agent, such as carboxymethyl 
cellulose or autologous blood clot, to prevent the protein compositions from disassociating from 
the matrix. 

A preferred family of sequestering agents is cellulosic materials such as alkylcelluloses
(including hydroxyalkylcelluloses), including methylcellulose, ethylcellulose,
hydroxyethylcellulose, hydroxypropylcellulose, hydroxypropyl-methylcellulose, and 
carboxymethylcellulose, the most preferred being cationic salts of carboxymethylcellulose 
(CMC). Other preferred sequestering agents include hyaluronic acid, sodium alginate,
poly(ethylene glycol), polyoxyethylene oxide, carboxyvinyl polymer and poly(vinyl alcohol).
The amount of sequestering agent useful herein is 0.5-20 wt %, preferably 1-10 wt % based on 
total formulation weight, which represents the amount necessary to prevent desorption of the 
protein from the polymer matrix and to provide appropriate handling of the composition, yet not 
so much that the progenitor cells are prevented from infiltrating the matrix, thereby providing the
protein the opportunity to assist the osteogenic activity of the progenitor cells. In further
compositions, proteins or other active ingredients of the invention may be combined with other
agents beneficial to the treatment of the bone and/or cartilage defect, wound, or tissue in 
question. These agents include various growth factors such as epidermal growth factor (EGF), 
platelet-derived growth factor (PDGF), transforming growth factors (TGF-α and TGF-β), and
insulin-like growth factor (IGF).
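
The sequestering-agent amounts stated above (0.5-20 wt %, preferably 1-10 wt %, based on total formulation weight) can be checked with a short calculation. The sketch below is illustrative only; the function names and the example masses are hypothetical and do not come from the specification.

```python
def weight_percent(agent_mass_g, total_formulation_mass_g):
    """Weight percent of sequestering agent relative to total formulation weight."""
    if total_formulation_mass_g <= 0 or agent_mass_g < 0:
        raise ValueError("masses must be non-negative and the total must be positive")
    return 100.0 * agent_mass_g / total_formulation_mass_g


def classify(wt_pct):
    """Compare a weight percent with the ranges stated in the text."""
    if 1.0 <= wt_pct <= 10.0:
        return "preferred (1-10 wt %)"
    if 0.5 <= wt_pct <= 20.0:
        return "useful (0.5-20 wt %)"
    return "outside the stated range"


# Hypothetical example: 0.5 g of carboxymethylcellulose in 10 g of formulation.
example_pct = weight_percent(0.5, 10.0)
print(example_pct, classify(example_pct))
```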

The therapeutic compositions are also presently valuable for veterinary applications. 
Particularly domestic animals and thoroughbred horses, in addition to humans, are desired 
patients for such treatment with proteins or other active ingredients of the present invention. The 
dosage regimen of a protein-containing pharmaceutical composition to be used in tissue
regeneration will be determined by the attending physician considering various factors which
modify the action of the proteins, e.g., amount of tissue weight desired to be formed, the site of
damage, the condition of the damaged tissue, the size of a wound, type of damaged tissue (e.g.,
bone), the patient's age, sex, and diet, the severity of any infection, time of administration and 
other clinical factors. The dosage may vary with the type of matrix used in the reconstitution and 
with inclusion of other proteins in the pharmaceutical composition. For example, the addition of
other known growth factors, such as IGF I (insulin like growth factor I), to the final composition,
may also affect the dosage. Progress can be monitored by periodic assessment of tissue/bone
growth and/or repair, for example, X-rays, histomorphometric determinations and tetracycline 
labeling. 

Polynucleotides of the present invention can also be used for gene therapy. Such
polynucleotides can be introduced either in vivo or ex vivo into cells for expression in a
mammalian subject. Polynucleotides of the invention may also be administered by other known 
methods for introduction of nucleic acid into a cell or organism (including, without limitation, in 
the form of viral vectors or naked DNA). Cells may also be cultured ex vivo in the presence of
proteins of the present invention in order to proliferate or to produce a desired effect on or
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes. 

4.12.3 EFFECTIVE DOSAGE 

Pharmaceutical compositions suitable for use in the present invention include
compositions wherein the active ingredients are contained in an effective amount to achieve their
intended purpose. More specifically, a therapeutically effective amount means an amount 
effective to prevent development of or to alleviate the existing symptoms of the subject being 
treated. Determination of the effective amount is well within the capability of those skilled in 
the art, especially in light of the detailed disclosure provided herein. For any compound used in
the method of the invention, the therapeutically effective dose can be estimated initially from
appropriate in vitro assays. For example, a dose can be formulated in animal models to achieve a
circulating concentration range that includes the IC50 as determined in cell culture (i.e., the
concentration of the test compound which achieves a half-maximal inhibition of the protein's
biological activity).
Such information can be used to more accurately determine useful doses in humans. 
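
As a minimal sketch of how an IC50 might be read from cell culture data, the following interpolates on a log-concentration scale between the two measured points that bracket 50% inhibition. The interpolation method, the function name and the data values are assumptions made for illustration; the specification does not prescribe a fitting procedure, and a full dose-response curve fit would normally be used in practice.

```python
import math


def estimate_ic50(concentrations, inhibition_pct):
    """Log-linear interpolation of the concentration giving 50% inhibition.

    Assumes inhibition increases with concentration and that the data bracket 50%.
    """
    points = sorted(zip(concentrations, inhibition_pct))
    for (c_lo, y_lo), (c_hi, y_hi) in zip(points, points[1:]):
        if y_lo <= 50.0 <= y_hi and y_hi != y_lo:
            frac = (50.0 - y_lo) / (y_hi - y_lo)
            log_lo, log_hi = math.log10(c_lo), math.log10(c_hi)
            return 10 ** (log_lo + frac * (log_hi - log_lo))
    raise ValueError("50% inhibition is not bracketed by the data")


# Hypothetical dose-response data: concentration in nM, inhibition in percent.
conc_nm = [1.0, 10.0, 100.0, 1000.0]
inhibition = [10.0, 30.0, 70.0, 95.0]
ic50_nm = estimate_ic50(conc_nm, inhibition)
print(f"estimated IC50: {ic50_nm:.1f} nM")
```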

A therapeutically effective dose refers to that amount of the compound that results in 
amelioration of symptoms or a prolongation of survival in a patient. Toxicity and therapeutic 
efficacy of such compounds can be determined by standard pharmaceutical procedures in cell
cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the
population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose
ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the 
ratio between LD50 and ED50. Compounds which exhibit high therapeutic indices are preferred. 
The data obtained from these cell culture assays and animal studies can be used in formulating a 
range of dosage for use in humans. The dosage of such compounds lies preferably within a range
of circulating concentrations that include the ED50 with little or no toxicity. The dosage may 
vary within this range depending upon the dosage form employed and the route of administration 
utilized. The exact formulation, route of administration and dosage can be chosen by the 
individual physician in view of the patient's condition. See, e.g., Fingl et al., 1975, in "The
Pharmacological Basis of Therapeutics", Ch. 1, p. 1. Dosage amount and interval may be adjusted
individually to provide plasma levels of the active moiety which are sufficient to maintain the 
desired effects, or minimal effective concentration (MEC). The MEC will vary for each 
compound but can be estimated from in vitro data. Dosages necessary to achieve the MEC will 
depend on individual characteristics and route of administration. However, HPLC assays or
bioassays can be used to determine plasma concentrations.
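
The therapeutic index defined above is the ratio of LD50 to ED50. The short sketch below computes it for hypothetical values; the function name and the numbers are illustrative assumptions and do not describe any compound of the invention.

```python
def therapeutic_index(ld50, ed50):
    """Ratio of LD50 to ED50; both must be positive and in the same units."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical values in mg/kg, for illustration only.
ti = therapeutic_index(ld50=200.0, ed50=5.0)
print(f"therapeutic index: {ti:.1f}")
```

A higher ratio indicates a wider margin between the therapeutically effective dose and the lethal dose, which is why compounds with high therapeutic indices are preferred.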

Dosage intervals can also be determined using the MEC value. Compounds should be
administered using a regimen which maintains plasma levels above the MEC for 10-90% of the 
time, preferably between 30-90% and most preferably between 50-90%. In cases of local, 
administration or selective uptake, the effective local concentration of the drug may not be 

20 related to plasma concentration. 
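
As a rough illustration of how the fraction of time above the MEC might be estimated, the sketch below assumes a one-compartment model with first-order elimination following each dose. That model, the function name, and all parameter values are assumptions made for this example; the text above does not prescribe any particular pharmacokinetic model.

```python
import math


def fraction_above_mec(c_peak: float, mec: float, k_elim: float, tau: float) -> float:
    """Fraction of a dosing interval during which plasma level exceeds the MEC.

    Assumes C(t) = c_peak * exp(-k_elim * t) within each interval of length tau.
    c_peak and mec share concentration units; k_elim is per hour; tau is in hours.
    """
    if min(c_peak, mec, k_elim, tau) <= 0:
        raise ValueError("all parameters must be positive")
    if c_peak <= mec:
        return 0.0
    # Time at which the decaying concentration falls to the MEC.
    t_above = math.log(c_peak / mec) / k_elim
    return min(1.0, t_above / tau)


# Hypothetical regimen: peak 8 units, MEC 1 unit, 4 h half-life, dosing every 24 h.
k = math.log(2) / 4.0
coverage = fraction_above_mec(c_peak=8.0, mec=1.0, k_elim=k, tau=24.0)
in_most_preferred_range = 0.5 <= round(coverage, 9) <= 0.9
```

Under these assumed values the level stays above the MEC for three half-lives (12 hours), i.e., about half of the 24 hour interval.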

An exemplary dosage regimen for polypeptides or other compositions of the invention 
will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight daily, with the preferred 
dose being about 0.1 µg/kg to 25 mg/kg of patient body weight daily, varying in adults and 
children. Dosing may be once daily, or equivalent doses may be delivered at longer or shorter 
intervals. 
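
The weight-based ranges above can be converted to absolute daily amounts as in the following sketch. The helper name and the 70 kg body weight are hypothetical; the per-kilogram bounds are those stated in the preceding paragraph, expressed in mg/kg (0.01 µg/kg = 1e-5 mg/kg; 0.1 µg/kg = 1e-4 mg/kg).

```python
def daily_dose_range_mg(weight_kg: float, low_mg_per_kg: float, high_mg_per_kg: float):
    """Return (low, high) absolute daily doses in mg for a given body weight."""
    if weight_kg <= 0:
        raise ValueError("body weight must be positive")
    return weight_kg * low_mg_per_kg, weight_kg * high_mg_per_kg


# Bounds from the text, converted to mg/kg.
EXEMPLARY = (1e-5, 100.0)   # about 0.01 µg/kg to 100 mg/kg
PREFERRED = (1e-4, 25.0)    # about 0.1 µg/kg to 25 mg/kg

# Hypothetical 70 kg patient.
exemplary_low, exemplary_high = daily_dose_range_mg(70.0, *EXEMPLARY)
preferred_low, preferred_high = daily_dose_range_mg(70.0, *PREFERRED)
```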

The amount of composition administered will, of course, be dependent on the subject 
being treated, on the subject's age and weight, the severity of the affliction, the manner of 
administration and the judgment of the prescribing physician. 

4.12.4 PACKAGING 

The compositions may, if desired, be presented in a pack or dispenser device which may 
contain one or more unit dosage forms containing the active ingredient. The pack may, for 
example, comprise metal or plastic foil, such as a blister pack. The pack or dispenser device may 
be accompanied by instructions for administration. Compositions comprising a compound of the 
invention formulated in a compatible pharmaceutical carrier may also be prepared, placed in an 
appropriate container, and labeled for treatment of an indicated condition. 



4.13 ANTIBODIES 

Also included in the invention are antibodies to proteins, or fragments of proteins of the 
invention. The term "antibody" as used herein refers to immunoglobulin molecules and 
immunologically active portions of immunoglobulin (Ig) molecules, i.e., molecules that contain 
an antigen binding site that specifically binds (immunoreacts with) an antigen. Such antibodies 
include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, Fab, Fab' and F(ab')2 
fragments, and an Fab expression library. In general, an antibody molecule obtained from 
humans relates to any of the classes IgG, IgM, IgA, IgE and IgD, which differ from one another 
by the nature of the heavy chain present in the molecule. Certain classes have subclasses as well, 
such as IgG1, IgG2, and others. Furthermore, in humans, the light chain may be a kappa chain or 
a lambda chain. Reference herein to antibodies includes a reference to all such classes, 
subclasses and types of human antibody species. 

An isolated related protein of the invention, or a portion or fragment thereof, may serve 
as an antigen and additionally can be used as an immunogen to generate 
antibodies that immunospecifically bind the antigen, using standard techniques for polyclonal 
and monoclonal antibody preparation. The full-length protein can be used or, alternatively, the 
invention provides antigenic peptide fragments of the antigen for use as immunogens. An 
antigenic peptide fragment comprises at least 6 amino acid residues of the amino acid sequence 
of the full length protein, such as an amino acid sequence shown in SEQ ID NO: 1787, and 
encompasses an epitope thereof such that an antibody raised against the peptide forms a specific 
immune complex with the full length protein or with any fragment that contains the epitope. 
Preferably, the antigenic peptide comprises at least 10 amino acid residues, or at least 15 amino 
acid residues, or at least 20 amino acid residues, or at least 30 amino acid residues. Preferred 
epitopes encompassed by the antigenic peptide are regions of the protein that are located on its 
surface; commonly these are hydrophilic regions. 

In certain embodiments of the invention, at least one epitope encompassed by the 
antigenic peptide is a region of the related protein that is located on the surface of the protein, e.g., a 
hydrophilic region. A hydrophobicity analysis of the human related protein sequence will 
indicate which regions of a related protein are particularly hydrophilic and, therefore, are likely 
to encode surface residues useful for targeting antibody production. As a means for targeting 
antibody production, hydropathy plots showing regions of hydrophilicity and hydrophobicity 
may be generated by any method well known in the art, including, for example, the Kyte 
Doolittle or the Hopp Woods methods, either with or without Fourier transformation. See, e.g., 
Hopp and Woods, 1981, Proc. Natl. Acad. Sci. USA 78: 3824-3828; Kyte and Doolittle 1982, J. 
Mol. Biol. 157: 105-142, each of which is incorporated herein by reference in its entirety. 
Antibodies that are specific for one or more domains within an antigenic protein, or derivatives, 
fragments, analogs or homologs thereof, are also provided herein. 
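
By way of illustration only, a sliding-window hydropathy profile of the Kyte Doolittle type might be computed as sketched below. The per-residue values are the published Kyte Doolittle hydropathy scale; the function names, the window length of 7, and the test sequence are assumptions chosen for this example and are not taken from the sequence listing. No Fourier transformation is applied in this sketch.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic, negative = hydrophilic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 7) -> list:
    """Mean hydropathy of each full window along a one-letter protein sequence."""
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def most_hydrophilic_window(sequence: str, window: int = 7):
    """Return (start index, peptide, mean score) of the most hydrophilic window."""
    profile = hydropathy_profile(sequence, window)
    if not profile:
        raise ValueError("sequence is shorter than the window")
    start = min(range(len(profile)), key=profile.__getitem__)
    return start, sequence[start:start + window], profile[start]


# Hypothetical sequence: a lysine-rich stretch flanked by isoleucine runs.
start, peptide, score = most_hydrophilic_window("IIIIIIIKKKKKKKIIIIIII")
```

Windows with the lowest mean score are candidate surface-exposed regions; whether such a region is in fact a useful epitope would still need to be confirmed experimentally.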

A protein of the invention, or a derivative, fragment, analog, homolog or ortholog 
thereof, may be utilized as an immunogen in the generation of antibodies that 
immunospecifically bind these protein components. 

Various procedures known within the art may be used for the production of polyclonal or 
monoclonal antibodies directed against a protein of the invention, or against derivatives, 
fragments, analogs, homologs or orthologs thereof (see, for example, Antibodies: A Laboratory 
Manual, Harlow E, and Lane D, 1988, Cold Spring Harbor Laboratory Press, Cold Spring 
Harbor, NY, incorporated herein by reference). Some of these antibodies are discussed below. 

5.13.1 Polyclonal Antibodies 

For the production of polyclonal antibodies, various suitable host animals (e.g., rabbit, 
goat, mouse or other mammal) may be immunized by one or more injections with the native 
protein, a synthetic variant thereof, or a derivative of the foregoing. An appropriate 
immunogenic preparation can contain, for example, the naturally occurring immunogenic 
protein, a chemically synthesized polypeptide representing the immunogenic protein, or a 
recombinantly expressed immunogenic protein. Furthermore, the protein may be conjugated to 
a second protein known to be immunogenic in the mammal being immunized. Examples of such 
immunogenic proteins include but are not limited to keyhole limpet hemocyanin, serum albumin, 
bovine thyroglobulin, and soybean trypsin inhibitor. The preparation can further include an 
adjuvant. Various adjuvants used to increase the immunological response include, but are not 
limited to, Freund's (complete and incomplete), mineral gels (e.g., aluminum hydroxide), surface 
active substances (e.g., lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, 
dinitrophenol, etc.), adjuvants usable in humans such as Bacille Calmette-Guerin and 
Corynebacterium parvum, or similar immunostimulatory agents. Additional examples of 
adjuvants which can be employed include MPL-TDM adjuvant (monophosphoryl Lipid A, 
synthetic trehalose dicorynomycolate). 

The polyclonal antibody molecules directed against the immunogenic protein can be 
isolated from the mammal (e.g., from the blood) and further purified by well known techniques, 
such as affinity chromatography using protein A or protein G, which provide primarily the IgG 
fraction of immune serum. Subsequently, or alternatively, the specific antigen which is the 
target of the immunoglobulin sought, or an epitope thereof, may be immobilized on a column to 
purify the immune specific antibody by immunoaffinity chromatography. Purification of 
immunoglobulins is discussed, for example, by D. Wilkinson (The Scientist, published by The 
Scientist, Inc., Philadelphia PA, Vol. 14, No. 8 (April 17, 2000), pp. 25-28). 

5.13.2 Monoclonal Antibodies 

The term "monoclonal antibody" (MAb) or "monoclonal antibody composition", as used 
herein, refers to a population of antibody molecules that contain only one molecular species of 
antibody molecule consisting of a unique light chain gene product and a unique heavy chain 
gene product. In particular, the complementarity determining regions (CDRs) of the monoclonal 
antibody are identical in all the molecules of the population. MAbs thus contain an antigen 
binding site capable of immunoreacting with a particular epitope of the antigen characterized by 
a unique binding affinity for it. 

Monoclonal antibodies can be prepared using hybridoma methods, such as those 
described by Kohler and Milstein, Nature, 256:495 (1975). In a hybridoma method, a mouse, 
hamster, or other appropriate host animal, is typically immunized with an immunizing agent to 
elicit lymphocytes that produce or are capable of producing antibodies that will specifically bind 
to the immunizing agent. Alternatively, the lymphocytes can be immunized in vitro. 
The immunizing agent will typically include the protein antigen, a fragment thereof or a fusion 
protein thereof. Generally, either peripheral blood lymphocytes are used if cells of human origin 
are desired, or spleen cells or lymph node cells are used if non-human mammalian sources are 
desired. The lymphocytes are then fused with an immortalized cell line using a suitable fusing 
agent, such as polyethylene glycol, to form a hybridoma cell (Goding, Monoclonal Antibodies: 
Principles and Practice, Academic Press, (1986) pp. 59-103). Immortalized cell lines are usually 
transformed mammalian cells, particularly myeloma cells of rodent, bovine and human origin. 
Usually, rat or mouse myeloma cell lines are employed. The hybridoma cells can be cultured in 
a suitable culture medium that preferably contains one or more substances that inhibit the growth 
or survival of the unfused, immortalized cells. For example, if the parental cells lack the enzyme 
hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT), the culture medium for 
the hybridomas typically will include hypoxanthine, aminopterin, and thymidine ("HAT 
medium"), which substances prevent the growth of HGPRT-deficient cells. 

Preferred immortalized cell lines are those that fuse efficiently, support stable high level 
expression of antibody by the selected antibody-producing cells, and are sensitive to a medium 
such as HAT medium. More preferred immortalized cell lines are murine myeloma lines, which 
can be obtained, for instance, from the Salk Institute Cell Distribution Center, San Diego, 
California and the American Type Culture Collection, Manassas, Virginia. Human myeloma and 
mouse-human heteromyeloma cell lines also have been described for the production of human 
monoclonal antibodies (Kozbor, J. Immunol., 133:3001 (1984); Brodeur et al., Monoclonal 
Antibody Production Techniques and Applications, Marcel Dekker, Inc., New York, (1987) pp. 
51-63). 

The culture medium in which the hybridoma cells are cultured can then be assayed for 
the presence of monoclonal antibodies directed against the antigen. Preferably, the binding 
specificity of monoclonal antibodies produced by the hybridoma cells is determined by 
immunoprecipitation or by an in vitro binding assay, such as radioimmunoassay (RIA) or 
enzyme-linked immunosorbent assay (ELISA). Such techniques and assays are known in the 
art. The binding affinity of the monoclonal antibody can, for example, be determined by the 
Scatchard analysis of Munson and Pollard, Anal. Biochem., 107:220 (1980). Preferably, 
antibodies having a high degree of specificity and a high binding affinity for the target antigen 
are isolated. 
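
For orientation, the classical linear Scatchard relationship for a single class of binding sites, B/F = Ka (Bmax - B), can be fitted by ordinary least squares as sketched below. This is a simplified illustration and is not asserted to be the procedure of the cited reference; the function name and the synthetic data are hypothetical.

```python
def scatchard_fit(bound, free):
    """Fit B/F against B by least squares; return (Ka, Bmax).

    bound and free are equal-length sequences of bound and free ligand
    concentrations in the same units. The slope of the fitted line is -Ka
    and the x-intercept is Bmax.
    """
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two paired measurements")
    ratios = [b / f for b, f in zip(bound, free)]
    n = len(bound)
    mean_x = sum(bound) / n
    mean_y = sum(ratios) / n
    sxx = sum((x - mean_x) ** 2 for x in bound)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(bound, ratios))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ka = -slope
    bmax = intercept / ka
    return ka, bmax


# Synthetic, noise-free data generated from Ka = 2.0 per nM and Bmax = 10 nM.
true_ka, true_bmax = 2.0, 10.0
bound_nm = [1.0, 2.0, 4.0, 8.0]
free_nm = [b / (true_ka * (true_bmax - b)) for b in bound_nm]
fitted_ka, fitted_bmax = scatchard_fit(bound_nm, free_nm)
```

With real measurements the points scatter around the line, and curvature in the plot may indicate more than one class of binding site, in which case a single straight-line fit is not appropriate.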

After the desired hybridoma cells are identified, the clones can be subcloned by limiting 
dilution procedures and grown by standard methods. Suitable culture media for this purpose 
include, for example, Dulbecco's Modified Eagle's Medium and RPMI-1640 medium. 
Alternatively, the hybridoma cells can be grown in vivo as ascites in a mammal. 
The monoclonal antibodies secreted by the subclones can be isolated or purified from the culture 
medium or ascites fluid by conventional immunoglobulin purification procedures such as, for 
example, protein A-Sepharose, hydroxylapatite chromatography, gel electrophoresis, dialysis, or 
affinity chromatography. 

The monoclonal antibodies can also be made by recombinant DNA methods, such as 
those described in U.S. Patent No. 4,816,567. DNA encoding the monoclonal antibodies of the 
invention can be readily isolated and sequenced using conventional procedures (e.g., by using 
oligonucleotide probes that are capable of binding specifically to genes encoding the heavy and 
light chains of murine antibodies). The hybridoma cells of the invention serve as a preferred 
source of such DNA. Once isolated, the DNA can be placed into expression vectors, which are 
then transfected into host cells such as simian COS cells, Chinese hamster ovary (CHO) cells, or 
myeloma cells that do not otherwise produce immunoglobulin protein, to obtain the synthesis of 
monoclonal antibodies in the recombinant host cells. The DNA also can be modified, for 
example, by substituting the coding sequence for human heavy and light chain constant domains 
in place of the homologous murine sequences (U.S. Patent No. 4,816,567; Morrison, Nature 368, 
812-13 (1994)) or by covalently joining to the immunoglobulin coding sequence all or part of the 
coding sequence for a non-immunoglobulin polypeptide. Such a non-immunoglobulin 
polypeptide can be substituted for the constant domains of an antibody of the invention, or can 
be substituted for the variable domains of one antigen-combining site of an antibody of the 
invention to create a chimeric bivalent antibody. 

5.13.2 Humanized Antibodies 

The antibodies directed against the protein antigens of the invention can further comprise 
humanized antibodies or human antibodies. These antibodies are suitable for administration to 
humans without engendering an immune response by the human against the administered 
immunoglobulin. Humanized forms of antibodies are chimeric immunoglobulins, 
immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab', F(ab')2 or other 
antigen-binding subsequences of antibodies) that are principally comprised of the sequence of a human 
immunoglobulin, and contain minimal sequence derived from a non-human immunoglobulin. 
Humanization can be performed following the method of Winter and co-workers (Jones et al., 
Nature, 321:522-525 (1986); Riechmann et al., Nature, 332:323-327 (1988); Verhoeyen et al., 
Science, 239:1534-1536 (1988)), by substituting rodent CDRs or CDR sequences for the 
corresponding sequences of a human antibody. (See also U.S. Patent No. 5,225,539.) In some 
instances, Fv framework residues of the human immunoglobulin are replaced by corresponding 
non-human residues. Humanized antibodies can also comprise residues which are found neither 
in the recipient antibody nor in the imported CDR or framework sequences. In general, the 
humanized antibody will comprise substantially all of at least one, and typically two, variable 
domains, in which all or substantially all of the CDR regions correspond to those of a non-human 
immunoglobulin and all or substantially all of the framework regions are those of a human 
immunoglobulin consensus sequence. The humanized antibody optimally also will comprise at 
least a portion of an immunoglobulin constant region (Fc), typically that of a human 
immunoglobulin (Jones et al., 1986; Riechmann et al., 1988; and Presta, Curr. Op. Struct. Biol., 
2:593-596 (1992)). 

5.13.3 Human Antibodies 

Fully human antibodies relate to antibody molecules in which essentially the entire 
sequences of both the light chain and the heavy chain, including the CDRs, arise from human 
genes. Such antibodies are termed "human antibodies", or "fully human antibodies" herein. 
Human monoclonal antibodies can be prepared by the trioma technique; the human B-cell 
hybridoma technique (see Kozbor, et al., 1983 Immunol Today 4: 72) and the EBV hybridoma 
technique to produce human monoclonal antibodies (see Cole, et al., 1985 In: Monoclonal 
Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). Human monoclonal 
antibodies may be utilized in the practice of the present invention and may be produced by using 
human hybridomas (see Cote, et al., 1983. Proc Natl Acad Sci USA 80: 2026-2030) or by 
transforming human B-cells with Epstein Barr Virus in vitro (see Cole, et al., 1985 In: 
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). 

In addition, human antibodies can also be produced using additional techniques, 
including phage display libraries (Hoogenboom and Winter, J. Mol. Biol., 227:381 (1991); 
Marks et al., J. Mol. Biol., 222:581 (1991)). Similarly, human antibodies can be made by 
introducing human immunoglobulin loci into transgenic animals, e.g., mice in which the 
endogenous immunoglobulin genes have been partially or completely inactivated. Upon 
challenge, human antibody production is observed, which closely resembles that seen in humans 
in all respects, including gene rearrangement, assembly, and antibody repertoire. This approach 
is described, for example, in U.S. Patent Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126; 
5,633,425; 5,661,016, and in Marks et al. (Bio/Technology 10, 779-783 (1992)); Lonberg et al. 
(Nature 368, 856-859 (1994)); Morrison (Nature 368, 812-13 (1994)); Fishwild et al. (Nature 
Biotechnology 14, 845-51 (1996)); Neuberger (Nature Biotechnology 14, 826 (1996)); and 
Lonberg and Huszar (Intern. Rev. Immunol. 13, 65-93 (1995)). 

Human antibodies may additionally be produced using transgenic nonhuman animals 
which are modified so as to produce fully human antibodies rather than the animal's endogenous 
antibodies in response to challenge by an antigen. (See PCT publication WO94/02602). The 
endogenous genes encoding the heavy and light immunoglobulin chains in the nonhuman host 
have been incapacitated, and active loci encoding human heavy and light chain immunoglobulins 
are inserted into the host's genome. The human genes are incorporated, for example, using yeast 
artificial chromosomes containing the requisite human DNA segments. An animal which 
provides all the desired modifications is then obtained as progeny by crossbreeding intermediate 
transgenic animals containing fewer than the full complement of the modifications. The 
preferred embodiment of such a nonhuman animal is a mouse, and is termed the Xenomouse™ 
as disclosed in PCT publications WO 96/33735 and WO 96/34096. This animal produces B cells 
which secrete fully human immunoglobulins. The antibodies can be obtained directly from the 
animal after immunization with an immunogen of interest, as, for example, a preparation of a 
polyclonal antibody, or alternatively from immortalized B cells derived from the animal, such as 
hybridomas producing monoclonal antibodies. Additionally, the genes encoding the 
immunoglobulins with human variable regions can be recovered and expressed to obtain the 
antibodies directly, or can be further modified to obtain analogs of antibodies such as, for 
example, single chain Fv molecules. 

An example of a method of producing a nonhuman host, exemplified as a mouse, lacking 
expression of an endogenous immunoglobulin heavy chain is disclosed in U.S. Patent No. 
5,939,598. It can be obtained by a method including deleting the J segment genes from at least 
one endogenous heavy chain locus in an embryonic stem cell to prevent rearrangement of the 
locus and to prevent formation of a transcript of a rearranged immunoglobulin heavy chain locus, 
the deletion being effected by a targeting vector containing a gene encoding a selectable marker; 
and producing from the embryonic stem cell a transgenic mouse whose somatic and germ cells 
contain the gene encoding the selectable marker. 

A method for producing an antibody of interest, such as a human antibody, is disclosed in 
U.S. Patent No. 5,916,771. It includes introducing an expression vector that contains a 
nucleotide sequence encoding a heavy chain into one mammalian host cell in culture, introducing 
an expression vector containing a nucleotide sequence encoding a light chain into another 
mammalian host cell, and fusing the two cells to form a hybrid cell. The hybrid cell expresses an 
antibody containing the heavy chain and the light chain. 

In a further improvement on this procedure, a method for identifying a clinically relevant 
epitope on an immunogen, and a correlative method for selecting an antibody that binds 
immunospecifically to the relevant epitope with high affinity, are disclosed in PCT publication 
WO 99/53049. 

5.13.4 Fab Fragments and Single Chain Antibodies 

According to the invention, techniques can be adapted for the production of single-chain 
antibodies specific to an antigenic protein of the invention (see e.g., U.S. Patent No. 4,946,778). 
In addition, methods can be adapted for the construction of Fab expression libraries (see e.g., 
Huse, et al., 1989 Science 246: 1275-1281) to allow rapid and effective identification of 
monoclonal Fab fragments with the desired specificity for a protein or derivatives, fragments, 
analogs or homologs thereof. Antibody fragments that contain the idiotypes to a protein antigen 
may be produced by techniques known in the art including, but not limited to: (i) an F(ab')2 
fragment produced by pepsin digestion of an antibody molecule; (ii) an Fab fragment generated 
by reducing the disulfide bridges of an F(ab')2 fragment; (iii) an Fab fragment generated by the 
treatment of the antibody molecule with papain and a reducing agent and (iv) Fv fragments. 

5.13.5 Bispecific Antibodies 

Bispecific antibodies are monoclonal, preferably human or humanized, antibodies that 
have binding specificities for at least two different antigens. In the present case, one of the 
binding specificities is for an antigenic protein of the invention. The second binding target is any 
other antigen, and advantageously is a cell-surface protein or receptor or receptor subunit. 

Methods for making bispecific antibodies are known in the art. Traditionally, the 
recombinant production of bispecific antibodies is based on the co-expression of two 
immunoglobulin heavy-chain/light-chain pairs, where the two heavy chains have different 
specificities (Milstein and Cuello, Nature, 305:537-539 (1983)). Because of the random 
assortment of immunoglobulin heavy and light chains, these hybridomas (quadromas) produce a 
potential mixture of ten different antibody molecules, of which only one has the correct 
bispecific structure. The purification of the correct molecule is usually accomplished by affinity 
chromatography steps. Similar procedures are disclosed in WO 93/08829, published 13 May 
1993, and in Traunecker et al., 1991 EMBO J., 10:3655-3659. 
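
The figure of ten different antibody molecules follows from simple combinatorics, which the sketch below enumerates. It assumes that each heavy chain can pair with either light chain to form a half-molecule and that any two half-molecules can associate; the chain labels are hypothetical.

```python
from itertools import combinations_with_replacement, product

# Two heavy chains and two light chains; HA/LA and HB/LB are the cognate pairs.
HEAVY = ("HA", "HB")
LIGHT = ("LA", "LB")
COGNATE = {"HA": "LA", "HB": "LB"}

# Each half-molecule (arm) is one heavy chain paired with one light chain: 4 arms.
arms = list(product(HEAVY, LIGHT))

# A full antibody is an unordered pair of arms, repeats allowed: 10 molecules.
molecules = list(combinations_with_replacement(arms, 2))


def is_correct_bispecific(molecule) -> bool:
    """True if the two arms carry different heavy chains, each with its cognate light chain."""
    (h1, l1), (h2, l2) = molecule
    return h1 != h2 and COGNATE[h1] == l1 and COGNATE[h2] == l2


correct = [m for m in molecules if is_correct_bispecific(m)]
```

Under these assumptions `len(molecules)` is 10 and exactly one species, (HA/LA paired with HB/LB), has the desired bispecific structure.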

Antibody variable domains with the desired binding specificities (antibody-antigen 
combining sites) can be fused to immunoglobulin constant domain sequences. The fusion 
preferably is with an immunoglobulin heavy-chain constant domain, comprising at least part of 
the hinge, CH2, and CH3 regions. It is preferred to have the first heavy-chain constant region 
(CH1) containing the site necessary for light-chain binding present in at least one of the fusions. 
DNAs encoding the immunoglobulin heavy-chain fusions and, if desired, the immunoglobulin 
light chain, are inserted into separate expression vectors, and are co-transfected into a suitable 
host organism. For further details of generating bispecific antibodies see, for example, Suresh et 
al., Methods in Enzymology, 121:210 (1986). 

According to another approach described in WO 96/27011, the interface between a pair 
of antibody molecules can be engineered to maximize the percentage of heterodimers which are 
recovered from recombinant cell culture. The preferred interface comprises at least a part of the 
CH3 region of an antibody constant domain. In this method, one or more small amino acid side 
chains from the interface of the first antibody molecule are replaced with larger side chains (e.g. 
tyrosine or tryptophan). Compensatory "cavities" of identical or similar size to the large side 
chain(s) are created on the interface of the second antibody molecule by replacing large amino 
acid side chains with smaller ones (e.g. alanine or threonine). This provides a mechanism for 
increasing the yield of the heterodimer over other unwanted end-products such as homodimers. 

Bispecific antibodies can be prepared as full length antibodies or antibody fragments (e.g. 
F(ab')2 bispecific antibodies). Techniques for generating bispecific antibodies from antibody 
fragments have been described in the literature. For example, bispecific antibodies can be 
prepared using chemical linkage. Brennan et al., Science 229:81 (1985) describe a procedure 
wherein intact antibodies are proteolytically cleaved to generate F(ab')2 fragments. These 
fragments are reduced in the presence of the dithiol complexing agent sodium arsenite to 
stabilize vicinal dithiols and prevent intermolecular disulfide formation. The Fab' fragments 
generated are then converted to thionitrobenzoate (TNB) derivatives. One of the Fab'-TNB 
derivatives is then reconverted to the Fab'-thiol by reduction with mercaptoethylamine and is 
mixed with an equimolar amount of the other Fab'-TNB derivative to form the bispecific 
antibody. The bispecific antibodies produced can be used as agents for the selective 
immobilization of enzymes. 

Additionally, Fab' fragments can be directly recovered from E. coli and chemically 
coupled to form bispecific antibodies. Shalaby et al., J. Exp. Med. 175:217-225 (1992) describe 
the production of a fully humanized bispecific antibody F(ab')2 molecule. Each Fab' fragment 
was separately secreted from E. coli and subjected to directed chemical coupling in vitro to form 
the bispecific antibody. The bispecific antibody thus formed was able to bind to cells 
overexpressing the ErbB2 receptor and normal human T cells, as well as trigger the lytic activity 
of human cytotoxic lymphocytes against human breast tumor targets. 

Various techniques for making and isolating bispecific antibody fragments directly from 
recombinant cell culture have also been described. For example, bispecific antibodies have been 
produced using leucine zippers. Kostelny et al., J. Immunol. 148(5):1547-1553 (1992). The 
leucine zipper peptides from the Fos and Jun proteins were linked to the Fab' portions of two 
different antibodies by gene fusion. The antibody homodimers were reduced at the hinge region 
to form monomers and then re-oxidized to form the antibody heterodimers. This method can 
also be utilized for the production of antibody homodimers. The "diabody" technology 
described by Hollinger et al., Proc. Natl. Acad. Sci. USA 90:6444-6448 (1993) has provided an 
alternative mechanism for making bispecific antibody fragments. The fragments comprise a 
heavy-chain variable domain (VH) connected to a light-chain variable domain (VL) by a linker 
which is too short to allow pairing between the two domains on the same chain. Accordingly, 
the VH and VL domains of one fragment are forced to pair with the complementary VL and VH 
domains of another fragment, thereby forming two antigen-binding sites. Another strategy for 
making bispecific antibody fragments by the use of single-chain Fv (sFv) dimers has also been 
reported. See, Gruber et al., J. Immunol. 152:5368 (1994). 

Antibodies with more than two valencies are contemplated. For example, trispecific 
antibodies can be prepared. Tutt et al., J. Immunol. 147:60 (1991). 
Exemplary bispecific antibodies can bind to two different epitopes, at least one of which 
originates in the protein antigen of the invention. Alternatively, an anti-antigenic arm of an 
immunoglobulin molecule can be combined with an arm which binds to a triggering molecule on 
a leukocyte such as a T-cell receptor molecule (e.g. CD2, CD3, CD28, or B7), or Fc receptors for 
IgG (FcγR), such as FcγRI (CD64), FcγRII (CD32) and FcγRIII (CD16) so as to focus cellular 
defense mechanisms to the cell expressing the particular antigen. Bispecific antibodies can also 
be used to direct cytotoxic agents to cells which express a particular antigen. These antibodies 
possess an antigen-binding arm and an arm which binds a cytotoxic agent or a radionuclide 
chelator, such as EOTUBE, DTPA, DOTA, or TETA. Another bispecific antibody of interest 
binds the protein antigen described herein and further binds tissue factor (TF). 

5.13.6 Heteroconjugate Antibodies 

Heteroconjugate antibodies are also within the scope of the present invention. 
Heteroconjugate antibodies are composed of two covalently joined antibodies. Such antibodies 
have, for example, been proposed to target immune system cells to unwanted cells (U.S. Patent 
No. 4,676,980), and for treatment of HIV infection (WO 91/00360; WO 92/200373; EP 03089). 
It is contemplated that the antibodies can be prepared in vitro using known methods in synthetic 
protein chemistry, including those involving crosslinking agents. For example, immunotoxins 
can be constructed using a disulfide exchange reaction or by forming a thioether bond. 
Examples of suitable reagents for this purpose include iminothiolate and 
methyl-4-mercaptobutyrimidate and those disclosed, for example, in U.S. Patent No. 4,676,980. 

5.13.7 Effector Function Engineering 

It can be desirable to modify the antibody of the invention with respect to effector function, so as 
to enhance, e.g., the effectiveness of the antibody in treating cancer. For example, cysteine 
residue(s) can be introduced into the Fc region, thereby allowing interchain disulfide bond 
formation in this region. The homodimeric antibody thus generated can have improved 
internalization capability and/or increased complement-mediated cell killing and 
antibody-dependent cellular cytotoxicity (ADCC). See Caron et al., J. Exp. Med., 176: 1191-1195 (1992) 
and Shopes, J. Immunol., 148: 2918-2922 (1992). Homodimeric antibodies with enhanced 
anti-tumor activity can also be prepared using heterobifunctional cross-linkers as described in Wolff 
et al. Cancer Research, 53: 2560-2565 (1993). Alternatively, an antibody can be engineered that 
has dual Fc regions and can thereby have enhanced complement lysis and ADCC capabilities. 
See Stevenson et al., Anti-Cancer Drug Design, 3: 219-230 (1989). 


5.13.8 Immunoconjugates 

The invention also pertains to immunoconjugates comprising an antibody conjugated to a 
cytotoxic agent such as a chemotherapeutic agent, toxin (e.g., an enzymatically active toxin of 
bacterial, fungal, plant, or animal origin, or fragments thereof), or a radioactive isotope (i.e., a 
radioconjugate). 

Chemotherapeutic agents useful in the generation of such immunoconjugates have been 
described above. Enzymatically active toxins and fragments thereof that can be used include 
diphtheria A chain, nonbinding active fragments of diphtheria toxin, exotoxin A chain (from 
Pseudomonas aeruginosa), ricin A chain, abrin A chain, modeccin A chain, alpha-sarcin, 
Aleurites fordii proteins, dianthin proteins, Phytolacca americana proteins (PAPI, PAPII, and 
PAP-S), Momordica charantia inhibitor, curcin, crotin, Saponaria officinalis inhibitor, gelonin, 
mitogellin, restrictocin, phenomycin, enomycin, and the trichothecenes. A variety of 
radionuclides are available for the production of radioconjugated antibodies. Examples include 
212Bi, 131I, 131In, 90Y, and 186Re. 

Conjugates of the antibody and cytotoxic agent are made using a variety of bifunctional
protein-coupling agents such as N-succinimidyl-3-(2-pyridyldithiol) propionate (SPDP),
iminothiolane (IT), bifunctional derivatives of imidoesters (such as dimethyl adipimidate HCl),
active esters (such as disuccinimidyl suberate), aldehydes (such as glutaraldehyde), bis-azido
compounds (such as bis(p-azidobenzoyl) hexanediamine), bis-diazonium derivatives (such as
bis-(p-diazoniumbenzoyl)-ethylenediamine), diisocyanates (such as toluene 2,6-diisocyanate),
and bis-active fluorine compounds (such as 1,5-difluoro-2,4-dinitrobenzene). For example, a
ricin immunotoxin can be prepared as described in Vitetta et al., Science, 238: 1098 (1987).
Carbon-14-labeled 1-isothiocyanatobenzyl-3-methyldiethylene triaminepentaacetic acid
(MX-DTPA) is an exemplary chelating agent for conjugation of a radionuclide to the antibody. See
WO 94/11026.

In another embodiment, the antibody can be conjugated to a "receptor" (such as
streptavidin) for utilization in tumor pretargeting wherein the antibody-receptor conjugate is
administered to the patient, followed by removal of unbound conjugate from the circulation
using a clearing agent and then administration of a "ligand" (e.g., avidin) that is in turn
conjugated to a cytotoxic agent.

4.14 COMPUTER READABLE SEQUENCES 

In one application of this embodiment, a nucleotide sequence of the present invention can
be recorded on computer readable media. As used herein, "computer readable media" refers to
any medium which can be read and accessed directly by a computer. Such media include, but
are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and
magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM
and ROM; and hybrids of these categories such as magnetic/optical storage media. A skilled
artisan can readily appreciate how any of the presently known computer readable media can
be used to create a manufacture comprising computer readable medium having recorded thereon
a nucleotide sequence of the present invention. As used herein, "recorded" refers to a process for
storing information on computer readable medium. A skilled artisan can readily adopt any of the
presently known methods for recording information on computer readable medium to generate
manufactures comprising the nucleotide sequence information of the present invention.
A variety of data storage structures are available to a skilled artisan for creating a
computer readable medium having recorded thereon a nucleotide sequence of the present
invention. The choice of the data storage structure will generally be based on the means chosen
to access the stored information. In addition, a variety of data processor programs and formats
can be used to store the nucleotide sequence information of the present invention on computer
readable medium. The sequence information can be represented in a word processing text file,
formatted in commercially-available software such as WordPerfect and Microsoft Word, or
represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase,
Oracle, or the like. A skilled artisan can readily adapt any number of data processor structuring
formats (e.g., text file or database) in order to obtain computer readable medium having recorded
thereon the nucleotide sequence information of the present invention.
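
By way of non-limiting illustration, recording a sequence as an ASCII file might be sketched in Python as follows. The FASTA-style layout, the function names, and the placeholder sequence used in the usage note are assumptions made for this example only; the specification does not prescribe any particular file format.

```python
import os
import tempfile
import textwrap


def write_fasta(path, name, sequence, width=60):
    """Record a sequence as plain ASCII text in a FASTA-style layout."""
    with open(path, "w", encoding="ascii") as handle:
        handle.write(f">{name}\n")
        # Break the sequence into fixed-width lines for readability.
        handle.write("\n".join(textwrap.wrap(sequence, width)) + "\n")


def read_fasta(path):
    """Read back the single record written by write_fasta."""
    with open(path, encoding="ascii") as handle:
        lines = [line.strip() for line in handle if line.strip()]
    return lines[0][1:], "".join(lines[1:])


def roundtrip(name, sequence):
    """Write to a temporary file, read the record back, and clean up."""
    fd, path = tempfile.mkstemp(suffix=".fasta")
    os.close(fd)
    try:
        write_fasta(path, name, sequence)
        return read_fasta(path)
    finally:
        os.remove(path)
```

For instance, `roundtrip("example", "ACGT" * 40)` would return the same name and sequence that were recorded. A database table could serve equally well; the choice depends on how the stored information is to be accessed.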

By providing any of the nucleotide sequences SEQ ID NO: 1-1786 and 3573-5358 or a
representative fragment thereof, or a nucleotide sequence at least 95% identical to any of the
nucleotide sequences of SEQ ID NO: 1-1786 and 3573-5358 in computer readable form, a skilled
artisan can routinely access the sequence information for a variety of purposes. Computer
software is publicly available which allows a skilled artisan to access sequence information
provided in a computer readable medium. The examples which follow demonstrate how
software which implements the BLAST (Altschul et al., J. Mol. Biol. 215:403-410 (1990)) and
BLAZE (Brutlag et al., Comp. Chem. 17:203-207 (1993)) search algorithms on a Sybase system
is used to identify open reading frames (ORFs) within a nucleic acid sequence. Such ORFs may
be protein encoding fragments and may be useful in producing commercially important proteins
such as enzymes used in fermentation reactions and in the production of commercially useful
metabolites.
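
As a hedged illustration of what identifying ORFs in a stored sequence can involve, the following Python sketch scans the three forward reading frames for stretches that begin with ATG and end at the first in-frame stop codon. It is a simple scan written for this example and is not the BLAST or BLAZE algorithm; the function name, the `min_codons` threshold, and the restriction to the forward strand are assumptions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(sequence, min_codons=3):
    """Return (start, end, frame) for ATG...stop stretches on the forward strand.

    Coordinates are 0-based and end-exclusive, and the stop codon is included.
    """
    seq = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next start codon in this frame
    return orfs
```

A fuller implementation would also scan the reverse complement and would typically apply a much larger minimum length.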

As used herein, "a computer-based system" refers to the hardware means, software
means, and data storage means used to analyze the nucleotide sequence information of the
present invention. The minimum hardware means of the computer-based systems of the present
invention comprises a central processing unit (CPU), input means, output means, and data
storage means. A skilled artisan can readily appreciate that any one of the currently available
computer-based systems is suitable for use in the present invention. As stated above, the
computer-based systems of the present invention comprise a data storage means having stored
therein a nucleotide sequence of the present invention and the necessary hardware means and
software means for supporting and implementing a search means. As used herein, "data storage
means" refers to memory which can store nucleotide sequence information of the present
invention, or a memory access means which can access manufactures having recorded thereon
the nucleotide sequence information of the present invention.
As used herein, "search means" refers to one or more programs which are implemented
on the computer-based system to compare a target sequence or target structural motif with the
sequence information stored within the data storage means. Search means are used to identify
fragments or regions of a known sequence which match a particular target sequence or target
motif. A variety of known algorithms are disclosed publicly, and a variety of commercially
available software for conducting search means is available and can be used in the computer-based
systems of the present invention. Examples of such software include, but are not limited to,
Smith-Waterman, MacPattern (EMBL), BLASTN and BLASTX (NCBI). A
skilled artisan can readily recognize that any one of the available algorithms or implementing
software packages for conducting homology searches can be adapted for use in the present
computer-based systems. As used herein, a "target sequence" can be any nucleic acid or amino
acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can
readily recognize that the longer a target sequence is, the less likely a target sequence will be
present as a random occurrence in the database. The most preferred sequence length of a target
sequence is from about 10 to 300 amino acids, more preferably from about 30 to 100 nucleotide
residues. However, it is well recognized that searches for commercially important fragments,
such as sequence fragments involved in gene expression and protein processing, may be of
shorter length.
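
The relationship between target length and random occurrence can be made concrete with a short calculation. Under the simplifying assumption (made for this example only) that database residues are independent and uniformly distributed, a target of length L is expected to match by chance about (N - L + 1) / A^L times in a database of N residues, where A is the alphabet size (4 for nucleotides, 20 for amino acids). The function name below is hypothetical.

```python
def expected_random_hits(target_length, database_length, alphabet_size=4):
    """Expected number of chance exact matches of one target in a random database.

    Assumes independent, uniformly distributed residues (an idealization).
    """
    positions = max(database_length - target_length + 1, 0)
    return positions / alphabet_size ** target_length
```

Under this model, each additional nucleotide in the target reduces the expected number of chance matches roughly fourfold, which is why longer targets are less likely to occur at random.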

As used herein, "a target structural motif," or "target motif," refers to any rationally
selected sequence or combination of sequences in which the sequence(s) are chosen based on a
three-dimensional configuration which is formed upon the folding of the target motif. There are
a variety of target motifs known in the art. Protein target motifs include, but are not limited to,
enzyme active sites and signal sequences. Nucleic acid target motifs include, but are not limited
to, promoter sequences, hairpin structures and inducible expression elements (protein binding
sequences).


4.15 TRIPLE HELIX FORMATION 

In addition, the fragments of the present invention, as broadly described, can be used to
control gene expression through triple helix formation or antisense DNA or RNA, both of which
methods are based on the binding of a polynucleotide sequence to DNA or RNA.
Polynucleotides suitable for use in these methods are preferably 20 to 40 bases in length and are
designed to be complementary to a region of the gene involved in transcription (triple helix - see
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan
et al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into
polypeptide. Both techniques have been demonstrated to be effective in model systems.
Information contained in the sequences of the present invention is necessary for the design of an
antisense or triple helix oligonucleotide.
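
As a minimal sketch of how sequence information feeds the design of an antisense oligonucleotide, the following Python function returns the DNA strand complementary to a chosen mRNA region, written 5' to 3', and enforces the preferred 20 to 40 base window stated above. The function name and the example region in the usage note are hypothetical, and practical design would weigh further considerations (for example target accessibility) that are not modeled here.

```python
COMPLEMENT = str.maketrans("ACGU", "TGCA")


def antisense_oligo(mrna_region, min_len=20, max_len=40):
    """Return the DNA oligonucleotide (5'->3') complementary to an mRNA region."""
    # Accept either RNA or DNA notation for the target region.
    region = mrna_region.upper().replace("T", "U")
    if not min_len <= len(region) <= max_len:
        raise ValueError("region length outside the preferred 20-40 base window")
    if set(region) - set("ACGU"):
        raise ValueError("region contains characters other than A, C, G, U/T")
    # Complement each base, then reverse so the result reads 5'->3'.
    return region.translate(COMPLEMENT)[::-1]
```

For example, `antisense_oligo("AUGGCCAAGCUUGGAUCCGA")` returns `"TCGGATCCAAGCTTGGCCAT"`.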


4.16 DIAGNOSTIC ASSAYS AND KITS 

The present invention further provides methods to identify the presence or expression of
one of the ORFs of the present invention, or homolog thereof, in a test sample, using a nucleic
acid probe or antibodies of the present invention, optionally conjugated or otherwise associated
with a suitable label.

In general, methods for detecting a polynucleotide of the invention can comprise
contacting a sample with a compound that binds to and forms a complex with the polynucleotide
for a period sufficient to form the complex, and detecting the complex, so that if a complex is
detected, a polynucleotide of the invention is detected in the sample. Such methods can also
comprise contacting a sample under stringent hybridization conditions with nucleic acid primers
that anneal to a polynucleotide of the invention under such conditions, and amplifying annealed
polynucleotides, so that if a polynucleotide is amplified, a polynucleotide of the invention is
detected in the sample.

In general, methods for detecting a polypeptide of the invention can comprise contacting
a sample with a compound that binds to and forms a complex with the polypeptide for a period
sufficient to form the complex, and detecting the complex, so that if a complex is detected, a
polypeptide of the invention is detected in the sample.

In detail, such methods comprise incubating a test sample with one or more of the
antibodies or one or more of the nucleic acid probes of the present invention and assaying for
binding of the nucleic acid probes or antibodies to components within the test sample.

Conditions for incubating a nucleic acid probe or antibody with a test sample vary.
Incubation conditions depend on the format employed in the assay, the detection methods
employed, and the type and nature of the nucleic acid probe or antibody used in the assay. One
skilled in the art will recognize that any one of the commonly available hybridization,
amplification or immunological assay formats can readily be adapted to employ the nucleic acid
probes or antibodies of the present invention. Examples of such assays can be found in Chard,
T., An Introduction to Radioimmunoassay and Related Techniques, Elsevier Science Publishers,
Amsterdam, The Netherlands (1986); Bullock, G.R. et al., Techniques in Immunocytochemistry,
Academic Press, Orlando, FL Vol. 1 (1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice
and Theory of Immunoassays: Laboratory Techniques in Biochemistry and Molecular Biology,
Elsevier Science Publishers, Amsterdam, The Netherlands (1985). The test samples of the
present invention include cells, protein or membrane extracts of cells, or biological fluids such as
sputum, blood, serum, plasma, or urine. The test sample used in the above-described method
will vary based on the assay format, nature of the detection method and the tissues, cells or
extracts used as the sample to be assayed. Methods for preparing protein extracts or membrane
extracts of cells are well known in the art and can readily be adapted in order to obtain a
sample which is compatible with the system utilized.

In another embodiment of the present invention, kits are provided which contain the
necessary reagents to carry out the assays of the present invention. Specifically, the invention
provides a compartment kit to receive, in close confinement, one or more containers which
comprise: (a) a first container comprising one of the probes or antibodies of the present
invention; and (b) one or more other containers comprising one or more of the following: wash
reagents and reagents capable of detecting presence of a bound probe or antibody.

In detail, a compartment kit includes any kit in which reagents are contained in separate
containers. Such containers include small glass containers, plastic containers or strips of plastic
or paper. Such containers allow one to efficiently transfer reagents from one compartment to
another compartment such that the samples and reagents are not cross-contaminated, and the
agents or solutions of each container can be added in a quantitative fashion from one
compartment to another. Such containers will include a container which will accept the test
sample, a container which contains the antibodies used in the assay, containers which contain
wash reagents (such as phosphate buffered saline, Tris-buffers, etc.), and containers which
contain the reagents used to detect the bound antibody or probe. Types of detection reagents
include labeled nucleic acid probes, labeled secondary antibodies, or in the alternative, if the
primary antibody is labeled, the enzymatic or antibody binding reagents which are capable of
reacting with the labeled antibody. One skilled in the art will readily recognize that the disclosed
probes and antibodies of the present invention can be readily incorporated into one of the
established kit formats which are well known in the art.

4.17 MEDICAL IMAGING

The novel polypeptides and binding partners of the invention are useful in medical
imaging of sites expressing the molecules of the invention (e.g., where the polypeptide of the
invention is involved in the immune response, for imaging sites of inflammation or infection).
See, e.g., Kunkel et al., U.S. Pat. No. 5,413,778. Such methods involve chemical attachment of
a labeling or imaging agent, administration of the labeled polypeptide to a subject in a
pharmaceutically acceptable carrier, and imaging the labeled polypeptide in vivo at the target
site.

4.18 SCREENING ASSAYS 

Using the isolated proteins and polynucleotides of the invention, the present invention
further provides methods of obtaining and identifying agents which bind to a polypeptide
encoded by an ORF corresponding to any of the nucleotide sequences set forth in SEQ ID NO: 1-1786
and 3573-5358, or bind to a specific domain of the polypeptide encoded by the nucleic
acid. In detail, said method comprises the steps of:

(a) contacting an agent with an isolated protein encoded by an ORF of the present 
invention, or nucleic acid of the invention; and 

(b) determining whether the agent binds to said protein or said nucleic acid.

In general, therefore, such methods for identifying compounds that bind to a
polynucleotide of the invention can comprise contacting a compound with a polynucleotide of
the invention for a time sufficient to form a polynucleotide/compound complex, and detecting
the complex, so that if a polynucleotide/compound complex is detected, a compound that binds
to a polynucleotide of the invention is identified.

Likewise, in general, such methods for identifying compounds that bind to a
polypeptide of the invention can comprise contacting a compound with a polypeptide of the
invention for a time sufficient to form a polypeptide/compound complex, and detecting the
complex, so that if a polypeptide/compound complex is detected, a compound that binds to a
polypeptide of the invention is identified.

Methods for identifying compounds that bind to a polypeptide of the invention can also
comprise contacting a compound with a polypeptide of the invention in a cell for a time
sufficient to form a polypeptide/compound complex, wherein the complex drives expression of a
reporter gene sequence in the cell, and detecting the complex by detecting reporter gene
sequence expression, so that if a polypeptide/compound complex is detected, a compound that
binds a polypeptide of the invention is identified.

Compounds identified via such methods can include compounds which modulate the
activity of a polypeptide of the invention (that is, increase or decrease its activity, relative to
activity observed in the absence of the compound). Alternatively, compounds identified via such
methods can include compounds which modulate the expression of a polynucleotide of the
invention (that is, increase or decrease expression relative to expression levels observed in the
absence of the compound). Compounds, such as compounds identified via the methods of the
invention, can be tested using standard assays well known to those of skill in the art for their
ability to modulate activity/expression.

The agents screened in the above assay can be, but are not limited to, peptides, 
carbohydrates, vitamin derivatives, or other pharmaceutical agents. The agents can be selected 
and screened at random or rationally selected or designed using protein modeling techniques. 

For random screening, agents such as peptides, carbohydrates, pharmaceutical agents and
the like are selected at random and are assayed for their ability to bind to the protein encoded by
the ORF of the present invention. Alternatively, agents may be rationally selected or designed.
As used herein, an agent is said to be "rationally selected or designed" when the agent is chosen
based on the configuration of the particular protein. For example, one skilled in the art can
readily adapt currently available procedures to generate peptides, pharmaceutical agents and the
like, capable of binding to a specific peptide sequence, in order to generate rationally designed
antipeptide peptides (for example, see Hruby et al., "Application of Synthetic Peptides: Antisense
Peptides," In Synthetic Peptides, A User's Guide, W.H. Freeman, NY (1992), pp. 289-307, and
Kasprzak et al., Biochemistry 28:9230-8 (1989)), or pharmaceutical agents, or the like.

In addition to the foregoing, one class of agents of the present invention, as broadly
described, can be used to control gene expression through binding to one of the ORFs or EMFs
of the present invention. As described above, such agents can be randomly screened or
rationally designed/selected. Targeting the ORF or EMF allows a skilled artisan to design
sequence specific or element specific agents, modulating the expression of either a single ORF or
multiple ORFs which rely on the same EMF for expression control. One class of DNA binding
agents is agents which contain base residues which hybridize or form a triple helix
by binding to DNA or RNA. Such agents can be based on the classic phosphodiester,
ribonucleic acid backbone, or can be a variety of sulfhydryl or polymeric derivatives which have
base attachment capacity.

Agents suitable for use in these methods preferably contain 20 to 40 bases and are
designed to be complementary to a region of the gene involved in transcription (triple helix - see
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan et
al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into
polypeptide. Both techniques have been demonstrated to be effective in model systems.
Information contained in the sequences of the present invention is necessary for the design of an
antisense or triple helix oligonucleotide and other DNA binding agents.

Agents which bind to a protein encoded by one of the ORFs of the present invention can 
be used as a diagnostic agent. Agents which bind to a protein encoded by one of the ORFs of the 
present invention can be formulated using known techniques to generate a pharmaceutical 
composition. 

4.19 USE OF NUCLEIC ACIDS AS PROBES 

Another aspect of the subject invention is to provide for polypeptide-specific nucleic acid
hybridization probes capable of hybridizing with naturally occurring nucleotide sequences. The
hybridization probes of the subject invention may be derived from any of the nucleotide
sequences SEQ ID NO: 1-1786 and 3573-5358. Because the corresponding gene is only
expressed in a limited number of tissues, a hybridization probe derived from any of the
nucleotide sequences SEQ ID NO: 1-1786 and 3573-5358 can be used as an indicator of the
presence of RNA of a cell type of such a tissue in a sample.

Any suitable hybridization technique can be employed, such as, for example, in situ 
hybridization. PCR as described in US Patents Nos. 4,683,195 and 4,965,188 provides 
additional uses for oligonucleotides based upon the nucleotide sequences. Such probes used in 
PCR may be of recombinant origin, may be chemically synthesized, or may be a mixture of both. The
probe will comprise a discrete nucleotide sequence for the detection of identical sequences or a 
degenerate pool of possible sequences for identification of closely related genomic sequences. 
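
To illustrate what a degenerate pool of possible sequences amounts to, the following Python sketch expands a probe written with the standard IUPAC nucleotide ambiguity codes into every discrete sequence it represents. The function name and the example probes are hypothetical and chosen only for illustration.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(probe):
    """List every discrete sequence represented by a degenerate probe."""
    choices = [IUPAC[base] for base in probe.upper()]
    return ["".join(combo) for combo in product(*choices)]
```

For example, `expand_degenerate("ATR")` returns `["ATA", "ATG"]`. The pool size is the product of the number of alternatives at each position, so it grows quickly as ambiguous positions are added.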

Other means for producing specific hybridization probes for nucleic acids include the
cloning of nucleic acid sequences into vectors for the production of mRNA probes. Such vectors
are known in the art and are commercially available and may be used to synthesize RNA probes
in vitro by means of the addition of the appropriate RNA polymerase, such as T7 or SP6 RNA
polymerase, and the appropriate radioactively labeled nucleotides. The nucleotide sequences may
be used to construct hybridization probes for mapping their respective genomic sequences. The
nucleotide sequence provided herein may be mapped to a chromosome or specific regions of a
chromosome using well known genetic and/or chromosomal mapping techniques. These
techniques include in situ hybridization, linkage analysis against known chromosomal markers,
hybridization screening with libraries or flow-sorted chromosomal preparations specific to
known chromosomes, and the like. The technique of fluorescent in situ hybridization of
chromosome spreads has been described, among other places, in Verma et al. (1988) Human
Chromosomes: A Manual of Basic Techniques, Pergamon Press, New York NY.

Fluorescent in situ hybridization of chromosomal preparations and other physical
chromosome mapping techniques may be correlated with additional genetic map data. Examples
of genetic map data can be found in the 1994 Genome Issue of Science (265:1981f). Correlation
between the location of a nucleic acid on a physical chromosomal map and a specific disease (or
predisposition to a specific disease) may help delimit the region of DNA associated with that
genetic disease. The nucleotide sequences of the subject invention may be used to detect
differences in gene sequences between normal, carrier or affected individuals.

4.20 PREPARATION OF SUPPORT BOUND OLIGONUCLEOTIDES

Oligonucleotides, i.e., small nucleic acid segments, may be readily prepared by, for 
example, directly synthesizing the oligonucleotide by chemical means, as is commonly practiced 
using an automated oligonucleotide synthesizer. 

Support bound oligonucleotides may be prepared by any of the methods known to those of
skill in the art using any suitable support such as glass, polystyrene or Teflon. One strategy is to
precisely spot oligonucleotides synthesized by standard synthesizers. Immobilization can be
achieved using passive adsorption (Inouye & Hondo, (1990) J. Clin. Microbiol. 28(6) 1469-72);
using UV light (Nagata et al., 1985; Dahlen et al., 1987; Morrissey & Collins, (1989) Mol. Cell
Probes 3(2) 189-207) or by covalent binding of base modified DNA (Keller et al., 1988; 1989); all
references being specifically incorporated herein.

Another strategy that may be employed is the use of the strong biotin-streptavidin
interaction as a linker. For example, Broude et al. (1994) Proc. Natl. Acad. Sci. USA 91(8) 3072-6,
describe the use of biotinylated probes, although these are duplex probes, that are immobilized on
streptavidin-coated magnetic beads. Streptavidin-coated beads may be purchased from Dynal, Oslo.
Of course, this same linking chemistry is applicable to coating any surface with streptavidin.
Biotinylated probes may be purchased from various sources, such as, e.g., Operon Technologies
(Alameda, CA).

Nunc Laboratories (Naperville, IL) also sells suitable material that could be used. Nunc
Laboratories has developed a method, termed CovaLink NH, by which DNA can be covalently bound
to the microwell surface. CovaLink NH is a polystyrene surface grafted with secondary amino
groups (>NH) that serve as bridge-heads for further covalent coupling. CovaLink Modules may be
purchased from Nunc Laboratories. DNA molecules may be bound to CovaLink exclusively at the
5'-end by a phosphoramidate bond, allowing immobilization of more than 1 pmol of DNA
(Rasmussen et al., (1991) Anal. Biochem. 198(1) 138-42).

The use of CovaLink NH strips for covalent binding of DNA molecules at the 5'-end has
been described (Rasmussen et al., 1991). In this technology, a phosphoramidate bond is employed
(Chu et al., (1983) Nucleic Acids Res. 11(8) 6513-29). This is beneficial as immobilization using
only a single covalent bond is preferred. The phosphoramidate bond joins the DNA to the
CovaLink NH secondary amino groups that are positioned at the end of 2 nm long spacer arms
covalently grafted onto the polystyrene surface. To link an oligonucleotide to
CovaLink NH via a phosphoramidate bond, the oligonucleotide terminus must have a 5'-end
phosphate group. It is, perhaps, even possible for biotin to be covalently bound to CovaLink and
then streptavidin used to bind the probes.

More specifically, the linkage method includes dissolving DNA in water (7.5 ng/ul) and
denaturing for 10 min. at 95°C and cooling on ice for 10 min. Ice-cold 0.1 M 1-methylimidazole,
pH 7.0 (1-MeIm7), is then added to a final concentration of 10 mM 1-MeIm7. The ssDNA solution is
then dispensed into CovaLink NH strips (75 ul/well) standing on ice.

Carbodiimide, 0.2 M 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide (EDC), dissolved in
10 mM 1-MeIm7, is made fresh and 25 ul added per well. The strips are incubated for 5 hours at
50°C. After incubation the strips are washed using, e.g., Nunc-Immuno Wash; first the wells are
washed 3 times, then they are soaked with washing solution for 5 min., and finally they are washed
3 times (wherein the washing solution is 0.4 N NaOH, 0.25% SDS heated to 50°C).
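
The per-well quantities implied by the figures above can be worked through in a few lines. The following Python sketch assumes, for illustration only, that volumes are additive and that the 1-methylimidazole is added as the stated 0.1 M stock, so that the DNA is diluted by the corresponding fraction; the function name and dictionary keys are hypothetical.

```python
def covalink_well_amounts(dna_ng_per_ul=7.5, meim_stock_m=0.1, meim_final_m=0.010,
                          dna_ul_per_well=75.0, edc_stock_m=0.2, edc_ul_per_well=25.0):
    """Per-well quantities implied by the protocol figures, assuming additive volumes."""
    # Fraction of the DNA/1-MeIm mixture contributed by the 1-MeIm stock.
    meim_fraction = meim_final_m / meim_stock_m
    # DNA concentration after dilution by the added 1-MeIm stock.
    dna_after_dilution = dna_ng_per_ul * (1.0 - meim_fraction)
    total_ul = dna_ul_per_well + edc_ul_per_well
    return {
        "dna_ng_per_well": dna_after_dilution * dna_ul_per_well,
        "edc_final_m": edc_stock_m * edc_ul_per_well / total_ul,
        "total_ul": total_ul,
    }
```

With the default values taken from the text, this gives roughly 506 ng of DNA per well, a final EDC concentration of about 0.05 M, and a reaction volume of 100 ul per well.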

It is contemplated that a further suitable method for use with the present invention is that
described in PCT Patent Application WO 90/03382 (Southern & Maskos), incorporated herein by
reference. This method of preparing an oligonucleotide bound to a support involves attaching a
nucleoside 3'-reagent through the phosphate group by a covalent phosphodiester link to aliphatic
hydroxyl groups carried by the support. The oligonucleotide is then synthesized on the supported
nucleoside and protecting groups removed from the synthetic oligonucleotide chain under standard
conditions that do not cleave the oligonucleotide from the support. Suitable reagents include
nucleoside phosphoramidite and nucleoside hydrogen phosphorate.

An on-chip strategy for the preparation of DNA probe
arrays may be employed. For example, addressable laser-activated photodeprotection may be
employed in the chemical synthesis of oligonucleotides directly on a glass surface, as described by
Fodor et al. (1991) Science 251(4995) 767-73, incorporated herein by reference. Probes may also
be immobilized on nylon supports as described by Van Ness et al. (1991) Nucleic Acids Res.
19(12) 3345-50; or linked to Teflon using the method of Duncan & Cavalier (1988) Anal. Biochem.
169(1) 104-8; all references being specifically incorporated herein.

Linking an oligonucleotide to a nylon support, as described by Van Ness et al. (1991),
requires activation of the nylon surface via alkylation and selective activation of the 5'-amine of
oligonucleotides with cyanuric chloride.

One particular way to prepare support bound oligonucleotides is to utilize the
light-generated synthesis described by Pease et al. (1994) PNAS USA 91(11) 5022-6, incorporated
herein by reference. These authors used current photolithographic techniques to generate arrays of
immobilized oligonucleotide probes (DNA chips). These methods, in which light is used to direct
the synthesis of oligonucleotide probes in high-density, miniaturized arrays, utilize photolabile
5'-protected N-acyl-deoxynucleoside phosphoramidites, surface linker chemistry and versatile
combinatorial synthesis strategies. A matrix of 256 spatially defined oligonucleotide probes may be
generated in this manner.

4.21 PREPARATION OF NUCLEIC ACID FRAGMENTS 

The nucleic acids may be obtained from any appropriate source, such as cDNAs, genomic
DNA, chromosomal DNA, microdissected chromosome bands, cosmid or YAC inserts, and RNA,
including mRNA without any amplification steps. For example, Sambrook et al. (1989) describes
three protocols for the isolation of high molecular weight DNA from mammalian cells (p.
9.14-9.23).

DNA fragments may be prepared as clones in M13, plasmid or lambda vectors and/or
prepared directly from genomic DNA or cDNA by PCR or other amplification methods. Samples
may be prepared or dispensed in multiwell plates. About 100-1000 ng of DNA samples may be
prepared in 2-500 ml of final volume.

The nucleic acids would then be fragmented by any of the methods known to those of skill 
in the art including, for example, using restriction enzymes as described at 9.24-9.28 of Sambrook et 
al (1989), shearing by ultrasound and NaOH treatment. 

Low pressure shearing is also appropriate, as described by Schriefer et al. (1990) Nucleic
Acids Res. 18(24) 7455-6, incorporated herein by reference. In this method, DNA samples are
passed through a small French pressure cell at a variety of low to intermediate pressures. A lever
device allows controlled application of low to intermediate pressures to the cell. The results of these
studies indicate that low-pressure shearing is a useful alternative to sonic and enzymatic DNA
fragmentation methods.

One particularly suitable way for fragmenting DNA is contemplated to be that using the two
base recognition endonuclease, CviJI, described by Fitzgerald et al. (1992) Nucleic Acids Res.
20(14) 3753-62. These authors described an approach for the rapid fragmentation and fractionation
of DNA into particular sizes that they contemplated to be suitable for shotgun cloning and
sequencing.

The restriction endonuclease CviJI normally cleaves the recognition sequence PuGCPy
between the G and C to leave blunt ends. Atypical reaction conditions, which alter the specificity of
this enzyme (CviJI**), yield a quasi-random distribution of DNA fragments from the small
molecule pUC19 (2688 base pairs). Fitzgerald et al. (1992) quantitatively evaluated the
randomness of this fragmentation strategy, using a CviJI** digest of pUC19 that was size
fractionated by a rapid gel filtration method and directly ligated, without end repair, to a lac Z minus
M13 cloning vector. Sequence analysis of 76 clones showed that CviJI** restricts PyGCPy and
PuGCPu, in addition to PuGCPy sites, and that new sequence data is accumulated at a rate
consistent with random fragmentation.
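
The site specificities described above can be illustrated with a short sketch. The Python below is not the procedure of Fitzgerald et al.; it is a minimal illustration, assuming that Pu denotes A or G, that Py denotes C or T, and that cleavage falls between the G and the C of each site. The names NORMAL, RELAXED, cut_positions and fragments, and the example sequence, are hypothetical.

```python
import re

# Assumed base classes: Pu = purine (A or G), Py = pyrimidine (C or T).
# Lookaheads are used so that overlapping sites are all reported.
NORMAL = re.compile(r"(?=[AG]GC[CT])")                # PuGCPy only
RELAXED = re.compile(r"(?=[ACGT]GC[CT]|[AG]GC[AG])")  # PuGCPy, PyGCPy, PuGCPu


def cut_positions(seq, pattern):
    """Return 0-based indices of the C in each recognized site.

    A cut at index i means the sequence is split into seq[:i] and seq[i:],
    i.e. a blunt cut between the G and the C.
    """
    return [m.start() + 2 for m in pattern.finditer(seq.upper())]


def fragments(seq, pattern):
    """Split seq at every cut position and return the resulting pieces."""
    seq = seq.upper()
    bounds = [0] + cut_positions(seq, pattern) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


if __name__ == "__main__":
    example = "TTAGCTAACGCCTTGGCATT"  # hypothetical test sequence
    print(fragments(example, NORMAL))
    print(fragments(example, RELAXED))
```

Scanning one strand is sufficient in this sketch because PuGCPy is its own reverse complement as a class, while PyGCPy and PuGCPu are reverse complements of each other and are both included in the relaxed pattern.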

As reported in the literature, advantages of this approach compared to sonication and
agarose gel fractionation include: smaller amounts of DNA are required (0.2-0.5 µg instead of 2-5
µg); and fewer steps are involved (no preligation, end repair, chemical extraction, or agarose gel
electrophoresis and elution are needed).

Irrespective of the manner in which the nucleic acid fragments are obtained or prepared, it is 
important to denature the DNA to give single stranded pieces available for hybridization. This is 
achieved by incubating the DNA solution for 2-5 minutes at 80-90°C. The solution is then cooled 
quickly to 2°C to prevent renaturation of the DNA fragments before they are contacted with the 
chip. Phosphate groups must also be removed from genomic DNA by methods known in the art.

4.22 PREPARATION OF DNA ARRAYS 

Arrays may be prepared by spotting DNA samples on a support such as a nylon membrane.
Spotting may be performed by using arrays of metal pins (the positions of which correspond to an
array of wells in a microtiter plate) to repeatedly transfer about 20 nl of a DNA solution to a
nylon membrane. By offset printing, a density of dots higher than the density of the wells is
achieved. One to 25 dots may be accommodated in 1 mm², depending on the type of label used. By
avoiding spotting in some preselected number of rows and columns, separate subsets (subarrays)
may be formed. Samples in one subarray may be the same genomic segment of DNA (or the same
gene) from different individuals, or may be different, overlapped genomic clones. Each of the
subarrays may represent replica spotting of the same samples. In one example, a selected gene
segment may be amplified from 64 patients. For each patient, the amplified gene segment may be in
one 96-well plate (all 96 wells containing the same sample). A plate for each of the 64 patients is
prepared. By using a 96-pin device, all samples may be spotted on one 8 x 12 cm membrane.
Subarrays may contain 64 samples, one from each patient. Where the 96 subarrays are identical, the
dot span may be 1 mm² and there may be a 1 mm space between subarrays.
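
The geometry of the 64-patient example can be checked with a small calculation. The sketch below is illustrative only: it assumes that the 64 dots of each subarray are laid out as a square 8 x 8 block, that each dot occupies 1 mm x 1 mm, and that the 96 pins are arranged as 8 rows by 12 columns. The function names and the patient-to-offset mapping are hypothetical.

```python
PIN_ROWS, PIN_COLS = 8, 12   # 96-pin device (assumed 8 x 12 arrangement)
SUB_SIDE = 8                 # 8 x 8 = 64 dots per subarray (assumed square)
DOT_MM = 1.0                 # each dot spans 1 mm x 1 mm
GAP_MM = 1.0                 # 1 mm space between subarrays
PITCH_MM = SUB_SIDE * DOT_MM + GAP_MM  # distance between subarray origins


def dot_position(patient, pin_row, pin_col):
    """Return the (x_mm, y_mm) top-left corner of one dot on the membrane.

    patient is 0..63 and selects the offset inside the subarray;
    pin_row and pin_col select which of the 96 subarrays is addressed.
    """
    if not 0 <= patient < SUB_SIDE * SUB_SIDE:
        raise ValueError("patient index must be in 0..63")
    if not (0 <= pin_row < PIN_ROWS and 0 <= pin_col < PIN_COLS):
        raise ValueError("pin coordinates out of range")
    off_row, off_col = divmod(patient, SUB_SIDE)
    x = pin_col * PITCH_MM + off_col * DOT_MM
    y = pin_row * PITCH_MM + off_row * DOT_MM
    return x, y


def membrane_extent():
    """Return (width_mm, height_mm) covered by all 96 subarrays."""
    width = PIN_COLS * PITCH_MM - GAP_MM
    height = PIN_ROWS * PITCH_MM - GAP_MM
    return width, height


if __name__ == "__main__":
    print(dot_position(0, 0, 0))
    print(dot_position(63, 7, 11))
    print(membrane_extent())
```

Under these assumptions the subarray pitch works out to 9 mm and the spotted area to 107 mm x 71 mm, which fits on the 8 x 12 cm membrane mentioned in the text.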

Another approach is to use membranes or plates (available from NUNC, Naperville, Illinois)
which may be partitioned by physical spacers, e.g., a plastic grid molded over the membrane, the grid
being similar to the sort of membrane applied to the bottom of multiwell plates, or hydrophobic
strips. A fixed physical spacer is not preferred for imaging by exposure to flat phosphor-storage
screens or x-ray films.

The present invention is illustrated in the following examples. Upon consideration of the
present disclosure, one of skill in the art will appreciate that many other embodiments and variations
may be made in the scope of the present invention. Accordingly, it is intended that the broader
aspects of the present invention not be limited to the disclosure of the following examples. The
present invention is not to be limited in scope by the exemplified embodiments which are intended
as illustrations of single aspects of the invention, and compositions and methods which are
functionally equivalent are within the scope of the invention. Indeed, numerous modifications and
variations in the practice of the invention are expected to occur to those skilled in the art upon
consideration of the present preferred embodiments. Consequently, the only limitations which
should be placed upon the scope of the invention are those which appear in the appended claims.

All references cited within the body of the instant specification are hereby incorporated by 
reference in their entirety. 

5.0 EXAMPLES

5.1.1 EXAMPLE 1 

Novel Nucleic Acid Sequences Obtained From Various Libraries 

A plurality of novel nucleic acids were obtained from cDNA libraries prepared from various 
human tissues and in some cases isolated from a genomic library derived from human chromosome 
using standard PCR, SBH sequence signature analysis and Sanger sequencing techniques. The 
inserts of the library were amplified with PCR using primers specific for the vector sequences which 
flank the inserts. Clones from cDNA libraries were spotted on nylon membrane filters and screened 
with oligonucleotide probes (e.g., 7-mers) to obtain signature sequences. The clones were clustered 
into groups of similar or identical sequences. Representative clones were selected for sequencing. 

In some cases, the 5' sequence of the amplified inserts was then deduced using a typical
Sanger sequencing protocol. PCR products were purified and subjected to fluorescent dye
terminator cycle sequencing. Single pass gel sequencing was done using a 377 Applied Biosystems
(ABI) sequencer to obtain the novel nucleic acid sequences. In some cases RACE (Rapid
Amplification of cDNA Ends) was performed to further extend the sequence in the 5' direction.

5.1.2 EXAMPLE 2
Assemblage of Novel Nucleic Acids 

The contigs or nucleic acids of the present invention, designated as SEQ ID NO: 3573-5358,
were assembled using an EST sequence as a seed. Then a recursive algorithm was used to extend
the seed EST into an extended assemblage, by pulling additional sequences from different databases
(i.e., Hyseq's database containing EST sequences, dbEST version 114, gb pri 114, and UniGene
version 101) that belong to this assemblage. The algorithm terminated when there were no
additional sequences from the above databases that would extend the assemblage. Inclusion of
component sequences into the assemblage was based on a BLASTN hit to the extending assemblage
with BLAST score greater than 300 and percent identity greater than 95%.
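
The extension loop described above can be sketched as follows. This is not the algorithm actually used; it is a minimal illustration, written iteratively rather than recursively, that keeps adding sequences until no further hit passes the two stated thresholds (score greater than 300, percent identity greater than 95%). The Hit record and the find_hits callback, which stands in for a BLASTN search of the databases against the current assemblage, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One database hit against the current assemblage (hypothetical record)."""
    seq_id: str
    score: float
    pct_identity: float


def accept(hit, min_score=300.0, min_identity=95.0):
    """Apply the inclusion thresholds stated in the text (both strict)."""
    return hit.score > min_score and hit.pct_identity > min_identity


def extend_assemblage(seed_id, find_hits):
    """Grow an assemblage from a seed until no new sequence qualifies.

    find_hits(members) must return an iterable of Hit objects for the
    current set of member identifiers; it is an assumed stand-in for the
    database search step.
    """
    members = {seed_id}
    while True:
        new = {h.seq_id for h in find_hits(members) if accept(h)} - members
        if not new:  # termination: nothing left that extends the assemblage
            return members
        members |= new


if __name__ == "__main__":
    # Toy hit table keyed by member identifier (made-up values).
    toy = {
        "seed": [Hit("a", 450, 98.0), Hit("b", 250, 99.0)],
        "a": [Hit("c", 310, 96.5), Hit("d", 900, 94.0)],
        "c": [Hit("seed", 500, 99.0)],
    }
    lookup = lambda members: [h for m in members for h in toy.get(m, [])]
    print(sorted(extend_assemblage("seed", lookup)))
```

In the toy data, sequence "b" is rejected on score and "d" on percent identity, so only "a" and "c" join the seed.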

A polypeptide was predicted to be encoded by each of SEQ ID NO: 3573-5358 as set forth
below. The polypeptides were predicted using a software program called FASTY (available from
http://fasta.bioch.virginia.edu), which selects a polypeptide based on a comparison of the translated
novel polynucleotide to known polynucleotides (W.R. Pearson, Methods in Enzymology, 183:63-98
(1990), herein incorporated by reference). The predicted polypeptides are shown in Table 7.

5.2.2 EXAMPLE 3 
Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against GenBank. Other computer programs which may
have been used in the editing process were phredPhrap and Consed (University of Washington) and
ed-ready, ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice
variants, resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 1-327.
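
As an illustration of the kind of check that can precede hand editing, the sketch below translates a cDNA in a chosen reading frame and reports stop codons that occur before the final codon. It is not one of the programs named above; it assumes the standard genetic code, and the function names and example sequences are hypothetical.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cdna, frame=0):
    """Translate complete codons starting at offset frame (0, 1 or 2).

    Codons containing characters other than A, C, G, T become 'X'.
    """
    cdna = cdna.upper()
    return "".join(
        CODON_TABLE.get(cdna[i:i + 3], "X")
        for i in range(frame, len(cdna) - 2, 3)
    )


def premature_stops(cdna, frame=0):
    """Return nucleotide offsets of stop codons that precede the last codon."""
    protein = translate(cdna, frame)
    return [frame + 3 * i for i, aa in enumerate(protein[:-1]) if aa == "*"]


if __name__ == "__main__":
    print(translate("ATGAAAGGCTAA"), premature_stops("ATGAAAGGCTAA"))
    print(translate("ATGAAATAGGGCTAA"), premature_stops("ATGAAATAGGGCTAA"))
```

A frame shift typically shows up in such a check as an unexpected stop codon shortly downstream of the shifted position, which is one reason the two problems are often inspected together.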

Table 1 shows the various tissue sources of SEQ ID NO: 1-327.

The nearest neighbor results for SEQ ID NO: 1-327 were obtained by a FASTA version 3
search against Genpept release 117, using the FASTXY algorithm. FASTXY is an improved
version of FASTA alignment which allows in-codon frame shifts. The nearest neighbor result
showed the closest homologue for SEQ ID NO: 1-327 from Genpept. The translated amino acid
sequences which the nucleic acid sequences encode are shown in the Sequence Listing. The
nearest neighbor results for SEQ ID NO: 1-327 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.
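
The two summary values reported in Table 5 can be illustrated with a short sketch. It does not reproduce SignalP; it only shows how a maximum and a mean might be derived from per-residue S scores that some predictor has already produced. The sketch assumes that the maximum is taken over all supplied positions and that the mean is taken over the residues of the predicted signal peptide. The function name and the example scores are hypothetical.

```python
def s_score_summary(s_scores, signal_length):
    """Return (max_S, mean_S) for one polypeptide.

    s_scores      per-residue S scores, N-terminus first (assumed input)
    signal_length number of residues in the predicted signal peptide,
                  i.e. the cleavage site lies after this many residues
    """
    if not 0 < signal_length <= len(s_scores):
        raise ValueError("signal_length must be between 1 and len(s_scores)")
    max_s = max(s_scores)
    mean_s = sum(s_scores[:signal_length]) / signal_length
    return max_s, mean_s


if __name__ == "__main__":
    scores = [0.5, 0.75, 1.0, 0.25]  # made-up values for illustration
    print(s_score_summary(scores, 3))
```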

5.3.2 EXAMPLE 4 

Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against GenBank (i.e., dbEST version 117, gb pri 117,
UniGene version 117, Genpept release 117). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants,
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 328-1413.

Table 1 shows the various tissue sources of SEQ ID NO: 328-1413.

The nearest neighbor results for SEQ ID NO: 328-1413 were obtained by a BLASTP
version 2.0a19MP-WashU search against Genpept release 118, using the BLAST algorithm. The
nearest neighbor result showed the closest homologue for SEQ ID NO: 328-1413 from Genpept.
The translated amino acid sequences which the nucleic acid sequences encode are shown in
the Sequence Listing. The nearest neighbor results for SEQ ID NO: 328-1413 are shown in
Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp. 
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were 
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature, 
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence. 

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence. 

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.



5.3.2 EXAMPLE 5

Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against GenBank (i.e., dbEST version 117, gb pri 117,
UniGene version 117, Genpept release 117). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants,
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 1414-1652.

Table 1 shows the various tissue sources of SEQ ID NO: 1414-1652.

The nearest neighbor results for SEQ ID NO: 1414-1652 were obtained by a BLASTP
version 2.0a19MP-WashU search against Genpept release 118, using the BLAST algorithm. The
nearest neighbor result showed the closest homologue for SEQ ID NO: 1414-1652 from
Genpept. The translated amino acid sequences which the nucleic acid sequences encode are
shown in the Sequence Listing. The nearest neighbor results for SEQ ID NO: 1414-1652 are
shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the
signature region found in the indicated polypeptide sequences, the description of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.4.2 EXAMPLE 6 
Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against GenBank (i.e., dbEST version 118, gb pri 118,
UniGene version 118, Genpept release 118). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants,
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 1653-1745.

Table 1 shows the various tissue sources of SEQ ID NO: 1653-1745.

The homology results for SEQ ID NO: 1653-1745 were obtained by a BLASTP version
2.0a19MP-WashU search against Genpept release 118, using the BLAST algorithm. The results showed
homologues for SEQ ID NO: 1653-1745 from Genpept. The translated amino acid sequences
which the nucleic acid sequences encode are shown in the Sequence Listing. The homologues
with identifiable functions for SEQ ID NO: 1653-1745 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp. 
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were 
examined to determine whether they had identifiable signature regions. Table 3 shows the 
signature region found in the indicated polypeptide sequences, the description of the signature, 
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1) 
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were 
examined for domains with homology to certain peptide domains. Table 4 shows the name of 
the domain found, the description, the p-value and the pFam score for the identified domain 
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

5.5.2 EXAMPLE 7
Novel Nucleic Acids

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against GenBank (i.e., dbEST version 119, gb pri 119,
UniGene version 119, Genpept release 119). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide sequences, including splice variants,
resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS: 1746-1768.

Table 1 shows the various tissue sources of SEQ ID NO: 1746-1768.

The homology results for SEQ ID NO: 1746-1768 were obtained by a BLASTP version
2.0a19MP-WashU search against Genpept release 119, using the BLAST algorithm. The results showed
homologues for SEQ ID NO: 1746-1768 from Genpept. The translated amino acid sequences
which the nucleic acid sequences encode are shown in the Sequence Listing. The homologues
with identifiable functions for SEQ ID NO: 1746-1768 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were examined
to determine whether they had identifiable signature regions. Table 3 shows the signature region
found in the indicated polypeptide sequences, the description of the signature, the eMatrix
p-value(s) and the position(s) of the signature within the polypeptide sequence.

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process for
identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also disclosed by
Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the publication
"Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites"
Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by reference. A maximum
S score and a mean S score, as described in the Nielsen et al. reference, were obtained for the
polypeptide sequences. Table 5 shows the position of the signal peptide in each of the polypeptides
and the maximum score and mean score associated with that signal peptide.

5.6.2 EXAMPLE 8

Novel Nucleic Acids 

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During editing, the sequence was
checked using FASTY and/or BLAST against GenBank (i.e., dbEST version 120, gb pri 120,
UniGene version 120, Genpept release 120). Other computer programs which may have been used
in the editing process were phredPhrap and Consed (University of Washington) and ed-ready,
ed-ext and gc-zip-2 (Hyseq, Inc.). The translated amino acid sequences which the nucleic acid
sequences encode are shown in the Sequence Listing. The full-length nucleotide sequences, including
splice variants, resulting from these procedures are shown in the Sequence Listing as SEQ ID NOS:
1769-1786.

Table 1 shows the various tissue sources of SEQ ID NO: 1769-1786.

The homology results for SEQ ID NO: 1769-1786 were obtained by a BLASTP version
2.0a19MP-WashU search against Genpept release 120 and the amino acid version of Geneseq
released on October 26, 2000, using the BLAST algorithm. The results showed homologues for
SEQ ID NO: 1769-1786 from Genpept. The homologues with identifiable functions for SEQ ID
NO: 1769-1786 are shown in Table 2 below.

Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al., J. Comp. 
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the sequences were
examined to determine whether they had identifiable signature regions. Table 3 shows the 
signature region found in the indicated polypeptide sequences, the description of the signature, 
the eMatrix p-value(s) and the position(s) of the signature within the polypeptide sequence. 

Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide sequences were
examined for domains with homology to certain peptide domains. Table 4 shows the name of
the domain found, the description, the p-value and the pFam score for the identified domain
within the sequence.

The nucleotide sequence within the sequences that codes for signal peptide sequences and
their cleavage sites can be determined using the Neural Network SignalP V1.1 program (from
Center for Biological Sequence Analysis, The Technical University of Denmark). The process
for identifying prokaryotic and eukaryotic signal peptides and their cleavage sites is also
disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and prediction of their
cleavage sites" Protein Engineering, Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by
reference. A maximum S score and a mean S score, as described in the Nielsen et al. reference,
were obtained for the polypeptide sequences. Table 5 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with that signal
peptide.

WO 01/53312 PCT/US00/34263

Table 6 is a correlation table of all of the sequences and the SEQ ID NOS.



TABLE 1

Tissue Origin: adult brain; RNA Source: GIBCO; Hyseq Library Name: AB3001; SEQ ID NOS:
9 19-21 50-51 65-66 72 78 80 82 
85 87 107-108 113 116 123 138 
140 150-152 159 169 177 192-193 
202-203 212-214 225-226 235-236 
251 258 268-269 272 280-281 295 
298 301 321 326 331-332 334 356- 
357 362 369 379 382-383 416 423 
443 459-460 473 475 477 488 496 
500 503 519 526 547 574 582 587 
608-609 613 618 633-634 645-646 
652 657-658 660 669-671 678 687 
695 697 710 715 724 731 775-777 
796 804 811 857-859 862 869 899- 
900 912 919 922 924-929 933 936 
962 979 988-989 996 1001 1004- 
1008 1018 1039 1047 1059 1064 
1067 1070 1078 1082 1107 1113 
1116-1117 1131 1134-1137 1140 
1149 1151 1157 1180 1206 1229 
1234 1241 1243 1258 1272-1273 
1279 1288-1290 1294 1307-1308 
1312 1320 1323 1330 1356 1360- 
1361 1368 1373-1375 1379 1391 
1400 1417 1446 1468 1482 1493- 
1494 1501-1503 1506-1507 1512 
1517 1522-1524 1530-1533 1537 
1549 1565 1578 1598 1606 1608 
1623 1625 1627 1639 1643 1648- 
1649 1653 1664 1667 1671 1696 
1734 1741 1743-1744 1760-1761 
1771 



Tissue Origin: adult brain; RNA Source: GIBCO; Hyseq Library Name: ABD003; SEQ ID NOS:
3 12-14 18-19 25 30-31 34-36 43-
45 50-51 56 58 60 65-66 68-69 80 
82 85 87 92 104 107-108 112-113 
115-116 123-124 131-132 135-137 
139 142 146 148-149 152 154 157 
159 163 165 167 169 172 180 192- 
193 196-197 199 203 208 210 212- 
214 223 233 235-237 247 257 259 
261 268-269 272 276 280-281 284- 
288 291-292 295 297 300-301 304 
307 317 320-321 323 327 329-331 
333-334 345-349 356-357 379-381 
393 401 408 414 419 424 426-428 
430 433-436 438-439 443 445 449 
453-454 459-461 468 471-473 476- 
478 483 491 494 496 500 503 507- 
508 516 519-520 525-527 534 536- 
540 542-543 545 553 555 560 569- 
570 574-576 586-588 593 595 597 
601 606-609 616-620 622-623 625 
628-633 635-636 643 645-649 653 
655-656 660-665 668-670 676 681 
687 701 710 715 717 724-728 735 
743 745-746 750 753 759 765-766 
773 775-778 786 789 796 799-800 
802-803 810-811 815 817 820-821 
832 834-836 840 845-847 851 858- 
861 864 869 874 878 883 897 901-
902 904-905 908 911-914 916 921- 
922 924-927 929 932-934 936-939 
941-942 945 955-958 963 966-969 
977 979-980 985-986 990 992-993 
997-1001 1005-1007 1012 1017- 
1020 1023-1024 1029-1031 1034 
1036 1039 1050 1059 1063-1066 
1078 1081-1082 1085-1086 1089
1097 1103 1107 1109 1112 1116-1117 1119 1121 1124 1127 1130
1134 1144-1145 1149 1151 1157-1158 1167 1170 1178 1184 1188
1190 1193-1194 1200 1202 1215-1217 1220 1226-1227 1229 1231
1241 1243 1247 1252 1258 1263 1267 1269 1279 1281 1284 1286-1289
1293 1294 1306-1307 1312 1316-1320 1326 1333 1338 1341
1344 1348 1351 1355-1357 1368 1374 1377 1380 1386 1389-1390
1394 1400 1409 1414 1422-1423 1425-1427 1437 1443 1446 1454
1456 1458-1459 1468 1470-1472 1478 1482-1483 1487-1488 1493
1497 1499 1506 1508-1511 1517 1522-1524 1530-1533 1545-1546
1548-1550 1552 1557-1559 [illegible]-1563 1565 1567 1569 1571 1586 1588
1591 1593 1595 1598-1601 1608 1611 1620-1621 1624-1626 1628
1630-1632 1636 1640-1641 1644-1645 1647 1649 1653-1655 1657
1664 1667 1669 1673 1678-1681 1686 1690 1694-1696 1701 1709
1711 1719 1722-1723 1726-1727 1731-1733 1738 1740 1743-1744
1747 1749 1753 1757-1758 1760-1761 1765 1771 1785


Tissue Origin: adult brain; RNA Source: Clontech; Hyseq Library Name: ABR001; SEQ ID NOS:
9 29 68-69 113 115 146 152 206 
223 245 277 307 320 324 330-331 
344 348 352 362 379 384 393 404 
408 414 441-442 454 469 481 490 
506 517 586 597 631 641 659 691 
715 799 803 833 865 871 875 880
882 908 920 937 1000 1005-1006 
1027 1036 1041 1043 1075 1107 
1112 1121 1127 1136-1137 1144- 
1147 1231 1238-1239 1280 1293 
1320 1345 1355 1361 1383-1384 
1400 1417 1448 1456 1476 1507 
1570 1572 1609-1610 1614 1620 
1626 1645 1653 1754 1759 1770 
1786 



Tissue Origin: adult brain; RNA Source: Clontech; Hyseq Library Name: ABR006; SEQ ID NOS:
5-8 15-16 168 212-213 271 278
280-281 291-292 300-301 310 314 
321 326 336-338 341 352 357 359- 
360 362 369 374 379 384 393 396- 
397 414 419-420 426-428 430 441- 
442 453 506 616-617 661 689 785 
798 845 1018 1109 1113 1124 1148 
1167 1187 1207 1227 1262 1265 
1285 1312 1317-1319 1324-1327 
1344 1369 1381 1400 1416 1421 
1427 1430-1431 1436 1471 1501 
1557-1559 1586 1588 1651 1653 
1664-1665 1671 1673 1690 1697- 
1698 1700 1711 1717 1719-1720 
1728 1736 1740 1743-1744 1757 
1760-1761 



RNA Source: Clontech; Hyseq Library Name: ABR008; SEQ ID NOS:
5-10 13-19 22-23 25 29 33 37-39
43-45 50-51 54-55 57-58 60-66
68-70 72 75 77-80 83 85 89-92 94
99-105 108-110 112-113 116-117 
123 128 133 135-137 139 143 145- 
146 148 152 154-155 157 166 168- 
172 174-175 181-184 188-190 193- 
194 196 198-200 202 204-205 207- 
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Tissue Origin 


RNA Source 


Hyoeq 
Library Name 


SEQ ID NOS: 








208 210 214-215 218 221-226 229
231-232 234-241 245-247 251-253
[illegible]-259 268-269 271 276-281
285-286 288 290-292 300-302 304
307 309-311 313 315 317-318 320-
322 325-326 328 330-331 333-338
[illegible] 344-347 349 352 354 356-357
362 369-373 376 379-380 382 384
[illegible] 390-391 393-394 397 399-403
405-411 414-415 417-420 426-428
437-438 440-444 453-455 462 464
467 469-471 476 478 482-484 488-
491 497 503 506-513 516-517 520
524-526 528-530 532-534 537-540
542 544 547-551 553 551 565-567
572-574 577 581 585 587-588 590-
591 597 599 601-602 606-610 612
615-617 619-620 622-623 628-629
631 633-634 636-641 643 645-647
651-653 655-664 669-671 673 679
682 687 689 691-700 702 706 710
715-717 720-721 725-734 736-739
742-743 746 750-752 756 758-759
762-764 766 768 773-778 780-782
784-785 787-789 794 796 799 802-
803 805 811 814-815 818 825-826
834-837 839-840 842-843 856-859
861-862 865 867-872 874-875 881
883-884 887 889-892 894-895 897-
898 901 904 908 910 912 914 917
919 921-924 926-927 930-932 935-
941 943 945 949 953-954 958 961-
963 967 969 971 975 977 981-983
986 988-990 992 997 999-1002
1004-1006 1008 1012 1018-1023
1027 1029-1031 1035-1037 1047-
1048 1053 1057 1059 1063 1068
1070 1072-1075 1077 1081-1083
1085-1093 1095-1096 1108-1112
1114-1125 1127 1131-1133 1135-
1138 1142-1145 1148-1158 1160-
1163 1167 1169 1172 1175 1177
1180 1183-1188 1191-1195 1199-
1200 1204 1206 1211 1213-1216
1222-1223 1226-1227 1229-1231
1234-1235 1241-1242 1244-1263
1266 1269-1271 1276-1277 1279-
1281 1284-1286 1292 1294-1295
1299 1305-1309 1312 1314 1316-
1319 1322 1324-1327 1330 1332
1334-1335 1339 1344-1346 1351
1354-1355 1357-1358 1365-1367
1369-1370 1373-1374 1376-1379
1381-1384 1386-1388 1392 1394
1396-1397 1400 1403-1407 1410
1414 1419-1420 1423 1432-1433
1435 1437-1438 1440-1442 1446
1448 1453-1455 1457 1461 1463-
1464 1466 1468 1471 1477 1480
1482-1483 1496 1502-1504 1507-
1509 1513 1519-1520 1524-1526
1536 1547 1549-1552 1567 1573-
1574 1578 1586-1589 1597-1598
1601-1602 1605 1607-1609 1611-
1617 1619-1621 1623 1625-1626
1635-1641 1643-1645 1649 1651
1653 1656-1658 1664 1669 1671-
1674 1676-1684 1686 1689-1690
1694-1696 1704-1705 1708-1709
Tissue Origin

RNA Source



Hyseq 
Library Name 



SEQ ID NOS: 



adult brain 



1720-1724 1726-1728 1730-1733 
1737-1740 1742-1745 1753 1756- 
1757 1759-1761 1765 1767 1771- 
1772 1776-1777 1779-1780 1786 



Clontech 



ABR011 



adult brain 



BioChain 



24 75 103 186 210 310-311 364- 
365 508 623 710 937 1002-1003 
1059 1204 1609 1731-1732 



ABR012 



adult brain 



46 182-184 204-205 300 739 767 
1371 1549 1620 1684 



Invitrogen 



ABR013 



adult brain 



Invitrogen 



185 204-205 364-365 393 497 595 
687 692-694 830 845 1068 1320 
1413 1640 



ABR014 



adult brain 
adult brain 



187 301 357 364-365 375 454 463 
731 859 939 983 1073 1262 1270 
1320 1403 1640 1651 1657 1696 
1722 1738 



Invitrogen 



ABR015 



419 434-435 441-442 763 789 983 
1320 



Invitrogen 



ABR016 



adult brain 



312 364-365 379 1320 1334-1335
1674 1722 1785 



Invitrogen



ABT004 



14-16 22-23 25 37-39 43 58 60
70-72 78 86 94 107 113 116 136-
137 143 146 152 161 173 182-184
194 196 198 210 218 229 259 267
295 298 309-310 320 321 324 336-
338 346-347 349-350 356-357 362
371 379-380 382-383 391 393 396
399 401 408 428 438 459 461 476
482 490 502 507-509 516 526 531
557 562 597 602 607-609 624 652
655 667 669 671-672 687-689 695-
696 710 712 715 721 732 739 743
750 753 766 778 780 781 789 803
814 826 830 837 841 857 869 874
894-895 925 937 949 954-956 960-
961 963 968-969 988 989 1000
1005-1006 1016-1019 1021 1036-
1037 1052 1086 1090 1109 1113
1115 1120-1121 1123-1124 1136-
1137 1140 1144-1147 1151 1167
1170 1174 1188 1193-1194 1205
1225 1229 1231 1254 1258 1262
1280 1285 1309 1312 1334-1335
1341 1343-1344 1356-1357 1370
1378-1379 1383-1384 1403-1404
1423 1429 1434 1442 1448 1451-
1452 1454 1470-1472 1482 1499
1525 1528-1529 1532 1536 1547
1554 1557-1559 1561-1562 1567
1585 1588 1590 1595 1601-1604
1608 1610-1613 1615 1619 1624
1627 1640 1644 1647 1660 1664
1666 1670 1675 1696 1704 1715
1723 1727 1738 1760-1761 1768
1779 1785-1786



cultured 
preadipocytes 



Stratagene



ADP001 



5-8 11 17 25 68-69 80 82 87 103
105 110 116 136-138 168 171 188-
189 196-198 261 267 276 288 293
301 318 331 336-338 379-380 391
400 428 430-431 510-512 520 524
527 549 557 561 602 618 620 622
631 637 647 670 681-682 710 731
748 782 793-794 817 834-836 843
845 858-859 879 882 893-895 934
960 982 986 995-996 1000 1002
1005-1007 1025 1027-1028 1032
1039 1045 1071 1078 1097 1099-
1102 1136-1137 1140 1219-1220
Tissue Origin



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS: 



adrenal gland 



1260 1271 1297-1298 1314 1320
1322 1329 1339 1345 1365-1366
1370-1371 1398 1408 1423 1431
1437 1466 1468 1533 1539 1594
1602 1608 1614 1631 1649-1650
1660 1662 1673 1687-1688 1696
1711 1719-1720 1742 1746 1749
1760-1761 1765 1767 1771 1785



Clontech 



ADR002 



adult heart 



4-10 15-16 25 29-31 43-45 47 50- 
51 55 60 62-63 65-66 75 80 102 
116 118 122 126 130 137 150 169- 
170 181 192 198 201-203 215 227- 
228 247 251 255 267-269 271 280- 
281 285 295 298 311 336-338 342
349 351-352 354 372-373 383-385 
391 400 410 415-416 424 426-427 
431 434-437 439 445 454 461 473 
477 483 491 493 497-498 503 516 
519 527 535 546 549 552 572-573 
581 588 595 600 602 608-610 620 
628-630 637 645-646 670 679 703 
713 715 719 732 734 744-746 758 
773-778 789 816 829 837 845 848 
869 875 883 898 904 912 922-923 
930-931 942 948 952 965 967 969 
976-977 981 990 992-993 1001 
1004 1049 1055 1059 1071-1072 
1076 1112-1113 1115 1121 1127 
1134-1135 1151 1158 1163 1175 
1181 1188 1209 1218 1224-1225 
1227 1231 1243 1270-1271 1274 
1280 1285 1290 1293 1307 1324- 
1325 1327 1330 1342-1343 1345 
1348 1365-1366 1369 1378-1379 
1387 1398 1400 1405 1417 1425- 
1426 1436 1440-1441 1444 1454 
1463-1464 1488 1491 1507 1512 
1538 1546 1567 1573-1575 1588 
1598 1609 1614 1618 1622 1624 
1627 1634 1636 1649 1651 1658 
1671 1674 1678-1679 1691-1692 
1703 1717 1727 1731-1732 1737 
1765 



GIBCO 



AHR001 



4-8 10-11 15-16 18-21 34-39 44-
46 50-52 57-58 60 62-63 71 75 82
85 87 89 94 97 100 103-104 108-
110 112 114 116 118-119 122-123
127 130-132 134 136-138 141-144
147-151 153 163-164 168-171 179
186 192 195 197 199 204-205 212-
215 220 225-226 229-230 232 234-
236 251 257 260 262 265 272 274
277 280-282 285-286 289-292 296
298-301 304 307 309 314 321 324-
325 330 333 336-338 345 349 351-
352 354 358 361 368 370 380 383-
384 387-388 391 393 397 401 406
408-409 411-412 414-416 430-431
433-439 445-446 449 452 454-455
457 459 462 469 472-473 476-480
483-484 487 490 492-493 496-498
503 506 508 510-513 516 519-522
526 534 536 540 542 546 549 553
560-562 574 577 581-582 584 586-
587 589 593 595 597 604-609 611-
612 615-620 622-623 626 632 637
645-652 656-660 665-666 670-672
674-675 683-684 687 692 694 697
701 709 712 715-716 719-720 725-
Tissue Origin


RNA Source 


Hyseq 
Library Name 






SEQ ID NOS:










726 728 730-732 735 738-739 743-
744 746 751 753 759 761 765 770-
771 775-780 785 788-790 796 802
804 810 812 817 821 826 828 830
837 843 845-847 849-853 857-861
863-864 869 871 875 877-879 881
883 887 890-892 894-895 897-898
901 903 906-907 911-913 915 919
921-925 927-928 933-935 945 958
961-963 967 969-972 975 977-978
980-986 990 992 999-1002 1005-
1007 1010 1016 1019-1020 1022-
1023 1025 1028-1037 1039-1040
1043 1047 1050 1054-1055 1057
1059 1063-1064 1067-1068 1070
1072 1075-1076 1083 1085-1087
1089 1093-1094 1104 1106 1108-
1109 1113 1116-1117 1119 1121
1124 1126 1128 1131 [illegible] 1144-
1145 1148-1149 [illegible] [illegible] 1167
1169-1170 1175 [illegible] 1192 1196
1199-1200 1202 [illegible]-1208 1211
1216 1218 1222 1227-1229 1232-
1235 1238-1241 1243 [illegible] 1247-
1248 1250 1253-1254 [illegible]-1258
1261 1268 1270-1271 [illegible] 1280-
1282 1287 1292 1298 [illegible] 1306
1308 1317-1321 1324 1330
1332 1334-1337 1339 [illegible]-1345
1349-1350 1354-1356 1359-1360
1365-1366 1369 1371 1374-1375
1378-1380 1383-1384 1389 1397
1400 1403 1409 1417 1423-1426
1437 1439 1442 1444 1446-1447
1450 1453 1468 1470 1473 1479
1481 1488 1490 1501-1504 1519
1521 1524 1528 1530-1534 1536-
1537 1539 1541-1542 1547 1553
1555 1560 1565 1567-1571 1588
1591 1597-1598 1601-1602 1605
1614-1616 1619-1620 1623-1628
1630-1632 1634 1636 1641 1644-
1645 1647 1649 1652-1655 1659
1662 1667 1673-1674 1680-1681
1684 1686-1688 1704-1705 1709
1711-1712 1717 1724 1726-1727
1731-1733 1737-1738 1741 1743-
1744 1749 1754-1755 1760-1761
1765 1772 1785








adult kidney


GIBCO 


AKD001 


4-8 10-11 17-21 29-31 35-39 42-
45 50-51 56-58 60-61 64 68-69 75
77 80 82 85 87 92-94 97 100 102-
104 107-108 112 116-117 119 123
127-133 136-137 139-141 143-144
147-154 157 161-163 165-166 169
172 176 178-179 192 194-197 199
201 203-206 209-210 212-213 215-
216 223-228 234-236 238 247 251-
253 257-259 261-262 265-269 271-
272 274 276-277 279-281 284-286
290 293 296 298-299 301-302 304
307 311-313 321 325-326 329-331
333 341 344 348-350 352 356 358-
359 362 364-365 368 370-372 374
376-377 380-382 392 395 398 400-
401 404 407-409 414-415 423-424
430-437 443-444 446 449 451 453-
455 459 461-462 464 467 469 471-
474 476-477 480-481 483 487-488
Tissue Origin



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS: 



adult kidney 



490-491 493 497-505 510-513 516-
520 522 524 526-529 534 537-540
544 547 549 554-556 560 562 564
567 571-576 578 582 586-589 592-
593 598-599 601 604-606 608-613
615-619 621-626 632-634 637-643
645-652 655 660-664 669-672 676
678-679 688 692-695 698 702 711
713 717 719-720 727 731 735-736
738 743 745-746 751 753 755 762-
763 765 771-773 775-778 780 786
788 793 795-796 800 803 805 808
810-812 814-819 821 826 829 832
834-838 842-845 848-855 857-861
864-865 867 869 871 874 876-883
886-887 889-891 893-896 898-900
902 906-908 910-914 918 920 922
925-927 929-935 937 940-942 945
948-949 951 953-958 960-961 963-
964 969-970 972 976-978 982-986
988-990 992-993 995-997 999-1002
1004-1008 1010 1012-1013 1016-
1017 1019-1020 1022 1025-1031
1035 1038-1040 1042 1044 1047
1050 1054-1055 1057-1064 1068
1070-1073 1078 1085-1086 1088-
1089 1092 1094 1097 1099-1102
1107 1109-1112 1116-1119 1121
1123-1125 1132-1135 1140 1142-
1143 1146-1147 1149-1150 1153-
1154 1157 1159 1163 1167 1170
1178-1179 1181 1183 1192 1196-
1200 1202-1204 1206-1211 1216-
1219 1221-1222 1225 1227-1230
1232-1234 1238-1241 1243-1244
1246-1247 1253 1257-1258 1260-
1261 1267-1268 1270 1272-1274
1281 1283 1287-1289 1293-1295
1299 1306 1308 1311 1313 1317-
1320 1323 1329-1330 1334-1335
1339 1341 1349-1350 1353-1357
1359 1367 1369 1373 1375 1378-
1379 1394 1397 1400 1403 1405
1407-1409 1417 1419 1423-1424
1428-1431 1433 1437 1438 1442-
1443 1445-1446 1448 1450 1453-
1454 1459 1461 1465 1468 1474-
1475 1478 1484-1488 1490 1492-
1493 1495 1497-1498 1506-1507
1509 1512 1518 1521 1522 1525
1527-1528 1532-1533 1537 1540-
1541 1547-1550 1552 1556-1559
1561 1565-1566 1568 1571 1575
1578-1579 1583 1586 1587 1589
1591-1592 1594 1598 1600 1603-
1604 1606 1608 1611 1613 1615-
1616 1618-1622 1624 1628 1631-
1632 1634-1636 1638 1639 1641
1644 1646-1649 1653 1656 1662
1664 1666-1667 1670 1671 1676-
1679 1683-1684 1686 1691-1692
1696-1699 1701 1709-1711 1713-
1714 1716-1719 1723 1724 1726-
1727 1733 1737-1738 1741 1743-
1744 1748-1749 1751 1760-1761
1763-1768 1778 1780 1785



Invitrogen



AKT002 



20-21 37-39 47 52 57 60 65-66 
68-69 80 104 107-108 122 130 133 
136-137 140 142-143 149 169 174 
Tissue Origin



adult lung 



lymph node 



RNA Source 



GIBCO 



Hyseq 
Library Name 



ALG001 



SEQ ID NOS: 



181 197 227-228 235-236 244 251 
261-265 267 280-281 286 290 299 
301 304-305 309 312-313 339 341 
344-345 349 358 370-372 376 382- 
383 387 392 401 414 416 421 430 
443 445 449 453-454 472 487-488 
504 506 513 516 519 522 528 536- 
540 546 554 585 587 594 598 602 
607 616-617 626-627 636 643 662- 
664 695 709 721 735 743 761 768 
775-777 788 796 804 814 827 837-
838 849-850 852-853 869-870 881 
890-892 898 903 905-907 914 919 
925 927 934 941 949 952 957 960 
962 968 970 1000 1008 1029-1030
1044 1052 1055 1063 1067-1068 
1073 1085 1099-1102 1107 1110- 
1111 1113 1115 1119 1126 1134 
1136-1137 1146-1148 1153 1159 
1192 1196 1199 1232-1233 1241 
1256 1264 1272-1273 1281 1285 
1293-1294 1299 1312 1320 1324- 
1325 1330 1344 1349 1351 1355- 
1356 1369 1378-1379 1403 1414 
1419 1428-1429 1436 1446 1458 
1463-1464 1467-1468 1470 1477- 
1478 1486 1491 1509 1519 1527
1529 1534 1547 1596 1600 1619 
1623 1629 1631 1634 1638 1643 
1647 1652 1660 1664 1667 1669- 
1670 1673 1686 1709 1727 1740
1776 



4-8 14 37-39 44-46 50-51 56 62- 
63 75 82 88 93 103-104 113 125 
133 140 143 150 152 154 157 162 
171-172 174-175 190-191 196 200 
211 214 219 223-224 227-228 251- 
252 256 265 272 274 280-281 285 
310 332 345 351 362 371 381-382 
394 408-409 431 436 445 454 459 
461 467 469 471 476-477 488 504 
513 527 537-540 544 547-548 554 
564 583 607 616-617 621 623-624 
634 645-646 662-664 670 695 716 
719 743-744 763 766 774 789 803 
811 814 817 831-832 837-838 845 
852-853 858-859 861 866 880 887 
901 905 941 954-957 966 971 977 
979 981 987 990 992 996 1001 
1005-1006 1014 1017 1045 1047 
1054 1059 1062 1064 1072 1080 
1086-1089 1094 1107 1126 1134 
1136-1137 1142 1150 1157 1173 
1190 1200 1208 1220 1241 1272-
1273 1280 1282 1295 1306 1320 
1331-1332 1353 1374 1379 1383- 
1384 1404 1409 1423 1434 1436 
1442 1474 1478 1494 1509 1522 
1525 1531-1532 1547 1549 1553- 
1554 1571 1598 1606 1613 1624 
1627-1629 1632 1642 1644 1662 
1669 1676-1677 1684 1696 1727 
1731-1732 1737-1738 1748-1749 
1786 



Clontech 



ALN001 



4 24 50-51 62 105 137 153 198 
201 223-224 234 268-269 272 280- 
281 287 301 312 329 343 382 421 
430 433 445 451 461-462 475 481- 
482 503 526 529 537-540 546-547 
Tissue Origin



young liver 



adult liver 



RNA Source 



GIBCO 



Invitrogen 



Hyseq 
Library Name 



ALV001 



ALV002 



SEQ ID NOS: 



621 626 649 679 719 725-726 738 

793 803 831 834-836 838 844 857- 

858 866 873 905 913 928 963 976 

1005-1006 1012 1038 1050 1116- 

1117 1151 1199 1204 1226 1243 

1265 1274 1324-1325 1339 1353 

1374 1377 1440-1441 1447 1504 

1549 1600 1618-1619 1631 1641 

1644 1653 1687-1688 1691-1692 
1741 1771 



5-8 11 20-21 46 50-51 58 65-66
75 79 82 93 97 102-103 108 110 
116 139 143-144 148-149 171-172 
174 187-189 194-195 198 209 214- 
215 230 250 258 267-269 280-281
306 309 342 351 356 359 362 372 
374 392 394 398 401 407-408 410 
414 431 444 455 459 476 478 483 
493 510-512 516 520 522 526 536 
549 571 574-577 585 592 601-602 
607 621-624 628-630 632-633 637 
648 660 666-667 678 697-698 700 
717 719 728 730 734 738 744-745 
766 770 773 779 788 800 808 812 
814 841 849-851 871 874 879 887 
893 898-900 902-904 906-907 911 
919 922 924 934 953 957 963 965 
970 984 986 997 1001 1004 1007 
1012 1029-1030 1033-1034 1052 
1061 1066 1070 1076 1086 1089 
1093 1099-1102 1110-1112 1116- 
1117 1119 1121 1125 1136-1137 
1144-1145 1156-1157 1159 1196 
1199-1200 1209 1211 1219-1220 
1241 1244 1262 1270 1275 1279 
1283 1295 1317-1320 1332 1339 
1344 1359 1362-1363 1379 1383- 
1384 1403 1415 1430-1431 1437 
1450 1467 1475-1476 1483-1484 
1494-1495 1498 1505 1512 1516 
1518-1519 1526 1529 1547 1550- 
1552 1557-1559 1565 1583 1587 
1597 1609 1614 1620 1631 1637
1641 1644 1654-1655 1662 1667 
1669 1684 1691-1692 1702 1711 
1725 1738 1741 1743-1744 1758 
1760-1761 1763-1765 1769 



5-8 17 20-21 32-33 41 55 58 64 
75 77 86 89 102 108 117 119 175- 
176 198 200 209 231 235-236 250 
272 275-276 284 306 316 321 325 
333 356 359 374 376 398 401 408 
414 428 430 433-435 454 476 494 
503-505 517-518 528 534 544 552 
561-563 567 578 581 608-609 630 
632 637 644 650 661 665 672 702 
707 710 721-722 750 753 778 782 
794 814 820 826 834-837 847 849- 
850 858 861 874 879 893 898 904
911 918 921-922 926 946 948 972 
978 986 996 1020 1027 1031 1034 
1053 1063 1068 1070 1073 1086 
1089 1093 1097 1113 1119 1156 
1159 1195 1198-1199 1208 1220 
1227 1241 1261 1272-1273 1277 
1285 1308 1315 1320 1324-1325 
1330 1362-1363 1375 1403 1408- 
1409 1415 1431-1432 1435 1467 
1469 1482 1504 1524 1542 1547 
Tissue Origin



adult liver 
adult ovary 



RNA Source 



Clontech 



Invitrogen 



Hyseq 
Library Name 



ALV003 



AOV001 



SEQ ID NOS: 



1550 1567 1578 1581 1583 1594 

1597 1601-1602 1611-1612 1615

1618-1619 1621 1625 1637 1645 

1647 1652 1654-1655 1660 1666

1669-1671 1684 1706 1722 1737- 

1738 1742-1744 1760-1761 1763- 
1765 1772 1774 



29 676 997 1063 1119 1536 1766 



1 4-18 20-23 29 35-40 42-48 50- 
51 53-58 61-63 65-66 68-69 73-75 
77-78 80 82 85 87 89 97 100-101 
103-104 106-108 110 113 115 118 
122-124 126 128 133-134 136-140
142 145-147 149-157 161 166 168- 
170 174 177-178 180 182-186 188- 
189 192-203 207 209 211-215 219 
221-224 229-230 234 242-243 246- 
247 255 258 260-262 265-269 271- 
272 274 277-281 284-286 288 290
295 299 301-302 304 307 309-311 
313-314 316 321 323-326 330 332- 
333 335-338 341 344 349 352-353 
356 358 360 362 370-372 376-377 
379-384 387 390-392 394 397-398 
400 403 408-410 412 414-416 423- 
424 426-427 430-435 439 443-446 
448-449 451 453-455 462-463 468- 
471 473 476-479 481-484 487 489- 
494 496-497 499-501 503-505 509- 
514 516-517 519-520 522 524 526 
528-534 541-544 546-547 549 552 
554-555 561-564 566-567 569-570 
572-573 575-576 579 581 583 585- 
588 590-591 593 595 597 599 601- 
605 607-613 615 618-622 624-627 
630 632-633 636-640 642 644-647 
649-652 654-655 657-665 667-675 
677-678 681 683-684 692-695 697- 
710 714-721 723 725-727 729 732 
734-735 743-746 750-751 753 758 
763 765 767 772-773 775-778 780
783-784 786 788 790-791 794-796 
800 803 805 809-811 813-815 818- 
819 821-824 826 828-829 831-832 
837-838 843-850 852-857 859-864 
867 869 871-872 874-875 878-883
887-888 890-895 898-910 912-914 
916 919-922 924 926-927 929-939 
941 943-946 948-951 953 955-958 
961-964 966-967 970-979 981-982 
985-986 988-990 992 995-997 999- 
1001 1004-1009 1011-1013 1016 
1019-1020 1024-1025 1029-1031 
1033-1035 1037 1039 1041-1047 
1050-1051 1054-1060 1062-1064 
1067-1070 1072-1073 1075-1076 
1078-1079 1085-1086 1089-1090 
1094-1096 1098-1103 1106-1108 
1112-1117 1119-1120 1123-1127 
1131-1135 1142-1143 1146-1149 
1153 1156 1158 1163 1165-1166 
1169-1171 1173-1175 1177-1178 
1180 1183-1185 1190-1191 1195 
1197-1200 1202 1205-1214 1217- 
1219 1221-1226 1232-1235 1238- 
1241 1243-1244 1247 1249 1252- 
1254 1256-1258 1262 1265 1267- 
1268 1270 1275 1278 1280-1283 
1286-1289 1291 1293-1294 1298- 
Tissue Origin


RNA Source 


Hyseq 
Library Name 






SEQ ID NOS:










1299 1306 1308 1312 1317-1321
1323 1327 1329-1330 1332-1333
1338-1339 1341 1343-1351 1356
1359 1361 1365-1366 1371-1375
1377-1379 1383-1384 1386 1389
1394 1400 1404 1416-1417 1422-
1427 1429-1431 1435-1436 1439-
1443 1445-1450 1453-1454 1459
1463-1464 1466 1468 1470 1474-
1481 1484-1485 1488 1491 1493-
1494 1496-1498 1501-1504 1506-
1507 1511-1517 1519 1521-1524
1526-1527 1530-1531 [illegible]-1536
1538-1539 1541 1546 [illegible]-1550
1553 1555-1559 1561-1563 1566-
1567 1569-1570 1572 1574-1575
1578 1580-1581 1587-1588 1590-
1591 1595 1597-1598 1600-1606
1609 1611-1621 1623-1630 1634
1636 1638 1641 1643 1645 1647-
1657 1659-1662 1664 1667 1669-
1671 1673-1674 1676-1681 1683-
1690 1699 1702-1707 1710-1711
1713-1714 1716-1719 1723-1724
1726-1728 1731-1733 1735 1737-
1738 1740-1741 1743-1744 1748-
1751 1753 1755-1756 1760-1762
1765 1767-1768 1770-1771 1776
1778-1779 1783-1784 1786




adult placenta 


Clontech 


APL001 


5-8 44-45 90-91 107-108 159 178
311 351 414 476 503 545 574 624
636 719 755 773 860 890-891 924
947 955-956 962 990 992 1002
1045 1202 1320 1369 1628 1686
1713-1714 1743-1744






placenta 


Invitrogen 


APL002 


14-16 26 29 43 60-61 79-80 103
106 116 135 171 177 180 194 196
198 210 216 235-236 272 290 299
309 329 334 339 359 379-380 417
423 430 434-435 448 454 483 490-
491 517 522 631 723 725-726 728
738 746 769 818 843 854-855 857-
858 916 948 953-954 976 988-989
1005-1006 1013 1033 1036 1064
1068 1070 1086 1139 1144-1145
1160 1277 1285 1317-1320 1343
1345 1429 1435 1438 1454 1482
1486 1490 1512 1519 1532 1549
1592-1593 1602 1626 1647 1649
1664 1673 1675 1722 1727 1730
1746 1776










adult spleen 


GIBCO 


ASP001 


3 5-8 12 15-16 19-21 24 29 34-36
44-45 57 60 82-83 87 89 94 98-99
103 106 108 117 119-121 139 141
147 152-153 155 166 169 171 174
178-180 196 198 201-206 209-211
215 219 234 253-254 256 258 264
272 280-281 290 295 302 309 312
325 333 341 349 358 372 382 386-
387 394 406 414 431 434-436 446
448 451 473 481 490-493 500 503
505 517 519 530 534 536-540 547
554 557 574-576 582 592 595 604
611-612 620-621 623 631-632 642
652 659 661 667 671 673-675 684
700 721 728 730 732 738 742-744
746 762 765 774 780 788-789 794
810-811 817 822 830 832 845 848
852-853 858 862 866 874 879 882



115 



WO 01/53312 



PCT/US00/34263



Tissue Origin 



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS:



884 906-908 912 919 921-923 926-
927 934 942 949 957-958 963 977-
978 983 990 992-994 996-997 999
1005-1007 1010 1012 1031 1036
1042-1044 1046 1049 1059 1068
1070 1076 1089-1090 1094 1103
1109 1113 1115 1124 1140 1163
1170 1174 1177 1190 1196 1219-
1220 1226-1227 1229 1236 1241
1246 1258 1269 1271 1274 1295
1301 1320 1322 1330 1334-1335
1339 1349 1351 1353 1359-1360
1364 1369 1374 1386 1397 1413
1417 1434 1436-1437 1439 1468
1474 1477 1480 1485-1487 1498
1512 1522 1525 1544-1549 1553
1560 1567 1591 1600 1631 1636
1651 1654-1655 1658 1662 1670
1674 1678-1679 1684 1686 1700
1727 1733 1738 1740-1741 1760-
1761 1774 1779 1781-1782



testis 



GIBCO 



ATS001 



5-8 10 26 30-31 47 50-51 57 68- 
69 82 84-85 97 102 113 119 137 
139 150 152 154 156 163 169 174 
176-177 192 194 196-197 212-215 
227-228 247 255 258 261 282 285 
288-289 301 307 311 316 330 334 
349 370-372 392 398 410 415 426- 
427 430-431 433 437 446 454 461 
469 473 477 481-482 493 499 502- 
503 513 522 526 547 552-553 563- 
564 572-573 575-576 581-582 585
599-602 605 612 615-617 620 631 
637 647 649-650 656 660 665 670 
674-675 712 719-721 723 728 731 
738 744 746 773 780 784 788-789
802 804 809 811 814 826 831 837 
843 845 848 859 866 869 877 905 
913 916 919 921 926 929 937 950 
960 963 971 975 977 981 990 992- 
993 1007 1016 1029-1030 1034- 
1035 1038-1039 1045 1059-1060 
1064 1070 1072-1073 1087 1089 
1097 1099-1102 1104 1108 1113 
1141 1149 1161-1162 1175 1208- 
1209 1222 1227 1229 1231 1235 
1238-1239 1243 1253 1285 1287- 
1289 1291-1293 1307 1311 1317- 
1320 1330 1332 1338 1345 1369 
1373-1374 1379 1389 1399-1400 
1409 1423-1424 1430 1435-1437 
1443 1459 1484 1486 1490 1493 
1496-1497 1501 1505 1509-1513 
1527 1530-1531 1533 1537 1546 
1549 1563 1565 1567 1569 1571 
1577 1586 1591 1599 1602 1625
1628 1630-1632 1636 1639 1642 
1649 1661-1662 1666-1667 1670 
1675 1684 1690 1699 1705 1712
1717 1724 1730 1737-1738 1752 
1767 1779 



Genomic DNA 
from BAC 63118 



Research 
Genetics 
(CITB BAC 
Library) 



BAC001 



68* 13*2 1412 



Genomic DNA 
from BAC 39316 



Research 
Genetics 
(CITB BAC 
Library) 



BAC002 



1411-1412 
Tissue Origin



Genomic DNA 
from BAC 39316 



adult bladder 



RNA Source 



Research 
Genetics 
(CITB BAC 
Library) 



Hyseq
Library Name 



BAC003 



SEQ ID NOS: 



1352 



Invitrogen 



BLD001



5-8 17-18 22-23 33 37-39 56-57 
80 93 100 120-121 169 201 237 
251-252 272 278 311 348 363 382 
413 415 424 430 443 483 502 542- 
543 562 564 607 616-617 626 635 
652 667 671 710 727 755-756 762 
773 786 788 837 840 866 893 898 
909 918 929 966 977 983 1016 
1025 1055 1073 1082 1140 1167
1185 1189 1199 1270 1369 1481 
1536 1560 1573 1596 1614 1636- 
1637 1649-1650 1654-1655 1658 
1669 1671 1690 1719 1727 1731- 
1732 1739 1741 1760-1761 1779 



bone marrow 



Clontech 



BMD001



3-8 11 13 18 29-31 33 35-36 40 
43-45 47-48 50-51 57 60 65-66 75 
80 82 85 88-89 94 100 103 107 
110 115 118-119 124-125 133-134 
136-137 139-141 146 150 152-153 
155 161 163 168-170 172 178-180 
187 192-193 197-198 203-205 210- 
213 215 217 219 222 224-226 233 
235-237 242-244 255 258 260 263- 
264 266 273 276 278 283 286 290 
295 301-302 307 312-313 321 330 
333 339 343 352 357-358 370-371 
382 384-385 387 389 394 408 410 
412 416 421 424-427 429-431 436- 
437 439 441-442 445 447 454-456 
461-462 471-472 475 477-479 481- 
482 485 488 493 498 500 503-506 
513 516 519 523-524 526 530 535- 
540 542 544-545 549 555 565 567 
569-577 581 583-586 588 593 601
603-604 608-609 613-619 621-622 
632-633 636-637 642 649-650 656- 
660 666 670 672 674-675 679 683 
701 708 716 718-720 731 735-736 
740-742 744-745 752 761 765 772- 
773 775-778 780 785-786 789-791 
796 798 802 810-812 823-824 826 
830 832-833 837-838 843-844 848- 
855 858-859 866-867 869 878-880 
883 890-892 896 903 905 908 912- 
914 922-924 927 930-931 937 939- 
941 952-953 955-958 963 969 973 
976 981 985 987 990 992 995 1000 
1002 1005-1007 1013 1016 1025 
1028-1031 1033 1035 1037 1039 
1042 1044 1047 1050 1053-1054 
1059 1061 1063 1066 1070-1071 
1079 1106 1110-1113 1115-1117 
1124 1126 1134-1135 1142 1144- 
1145 1163 1172 1178 1197 1199- 
1200 1202 1216-1217 1224 1227- 
1228 1240 1246 1254 1261 1266 
1270 1278 1281 1285 1287 1290- 
1291 1293 1299-1301 1308 1314 
1317-1320 1327 1331 1339 1343 
1346 1349 1353 1356 1361 1367 
1369 1372-1374 1379-1380 1394 
1400 1403 1406 1408 1413 1417 
1419 1423 1425-1427 1430-1431 
1433 1439 1443 1446-1449 1459 
1463-1464 1482 1486 1493-1494 
Tissue Origin



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS: 



1506 1509 1513 1521-1522 1524
1526 1528 1531 1536-1537 1543
1546 1548-1549 1552 1554-1555
1557-1559 1571-1572 1581 1589-
1592 1597-1600 1609 1614 1621
1626-1628 1630-1632 1634 1636
1638-1639 1641 1646-1647 1651
1653-1655 1661-1662 1676-1681
1684 1686 1690 1702 1707 1711
1713-1714 1717 1720 1722-1723
1727 1737-1738 1740 1758 1767
1772 1781-1782 1785-1786



bone marrow 



Clontech 



BMD002 



11 15-16 19 30-31 35-36 68-69 75 
83-84 93 99 103 108-109 118 137 
139 169-170 174 177 180 190 193 
212-213 219 222 225-226 232 237 
255 259 264 273-274 284 286 290- 
292 295 301 303-304 307 312-313 
316 324 326 330 334-335 348 352- 
353 357 360 370-373 384 386-387 
397 403-404 414-416 421 425-427 
429-430 433-436 440 444 451 454 
465-466 472 475 478 491 493 516 
520 523 525 531 545 548 552 566 
569-570 581 583 590-591 597-598 
601 616-617 621 641 650 652 656 
659 671 674-675 679 684 710 718- 
719 728 734 737-738 742 761 765 
774-778 790 811 814 818 830 834-
836 854-855 859 866 869 871 878- 
879 884 889 892 904 922-923 932 
990 992 998 1001 1004 1016 1036 
1042 1048 1051 1054-1055 1058 
1088-1089 1106 1112-1114 1155 
1157 1192 1200 1223 1227-1228 
1236-1237 1260-1261 1282-1283 
1285 1287 1295 1314 1317-1321 
1324-1327 1330 1333 1341 1343 
1347 1350 1353 1355-1357 1367 
1369-1370 1373 1377 1379 1381 
1383-1384 1394 1397 1400 1406 
1413 1417 1425-1427 1438 1442 
1446 1459-1460 1470 1493 1505 
1521 1536 1546-1549 1560 1573-
1574 1578 1598-1600 1621 1626
1631 1634 1646 1649 1653 1656 
1658 1669-1670 1683-1684 1687- 
1688 1690-1693 1696 1699 1702 
1704 1707-1709 1711 1720 1722- 
1723 1725 1727 1729 1731-1733 
1738-1740 1743-1746 1752 1755 
1760-1761 1767 1777 1781-1782 
1786 



bone marrow 



bone marrow 
adult colon 



Clontech 
Clontech 



73-74 503 922 1036 1711 



95-96 866 1320 1475 



Invitrogen 



CLN001 



17 56-58 103 110 117 144 150 171 
179 185 188-189 201 204-206 210 
218-221 225-226 231 237 251 277 
288 310 312 320 333 359 386 388 
394 408 420 455 481 485 503 510- 
512 590-591 615 635 647-648 665 
672 684 697 710 725-726 743 780 
786 788 826-827 848-850 854-855
858 866 872 898 918 921-923 953 
976 983 993 1005-1006 1017 1020 
1025 1027 1054-1055 1063 1068- 
1069 1140 1153 1170 1185 1196 
1199 1220 1280 1314-1315 1320 
1345 1351 1355 1369 1428 1439 
1462-1464 1512 1556 1583 1587 
1594 1596 1614 1625-1626 1631 
1639 1645 1650 1675-1677 1687- 
1688 1701 1713-1714 1724 1740
1765 



Mixture of 16 
tissues - 
mRNAs 



Various 
Vendors 



CTL016 



401 1490 1686 



Mixture of 16 
tissues - 
mRNAs* 
adult cervix 



Various 
Vendors 



CTL021 



312 782 1132-1133 1403 1712 1715 



BioChain 



CVX001 



1 4-8 11 13 18-21 25-26 30-31 33 
37-39 43 46-47 58 61 64-66 71 
73-74 82 85 94 100 103-104 113 
118 122 126 130 134 140 147 153- 
156 163 170 179 181 186 192 195- 
196 198 201-202 218-219 222 229- 
231 257 266 276-277 285-286 288 
298 301-302 304 307 312-314 324 
326 329-330 332 335 342 352 358 
362 371-372 376 379 381-382 384 
388 398 400 410 414 416 419-420 
426-427 430-431 433-436 439 446 
448 461-462 464 471-477 479 482- 
483 491 493 496 503 506 510-513 
516-517 526 530 535 542-544 546- 
547 557 561 572-573 575-577 581- 
582 585-586 588-589 593-594 600 
602 604-605 607-609 612 615-619 
623 644 650 654 657-658 662-665 
670 672 680 683 691-694 698 706 
708-709 711 713 720-721 727 729 
731-732 737 745-747 753-754 760 
765 771 774-777 780 790 793 796 
798 800 803 805 818 826 828 831- 
832 834-836 843 847-848 851-855 
857-860 864-866 869 871 876 878- 
880 882 887 890-891 897 899-902 
905-908 912-913 916 918-919 922
927 932 934-938 944 948 955-956 
958 963-964 967 969-970 972 976 
978-979 983 985 990 992 1000 
1005-1007 1016-1017 1024 1027 
1033 1036 1038 1045 1047 1053- 
1056 1066-1067 1071 1073 1075 
1079 1082 1098 1113 1124 1129 
1134 1139 1146-1149 1163 1167 
1170 1173 1175 1177 1181 1197 
1200 1202 1211 1214 1216 1221- 
1222 1225 1227 1232-1234 1240- 
1241 1243 1258 1264-1265 1268 
1270 1279 1287-1290 1308 1310- 
1311 1316 1320 1323 1327 1345 
1349 1353-1354 1360 1372-1374 
1383-1384 1386 1394 1397 1405- 



* The 16 tissue mRNAs and their vendor sources are as follows: 1) normal adult brain
mRNA (Invitrogen), 2) normal adult kidney mRNA (Invitrogen), 3) normal adult liver
mRNA (Invitrogen), 4) normal fetal brain mRNA (Invitrogen), 5) normal fetal kidney
mRNA (Invitrogen), 6) normal fetal liver mRNA (Invitrogen), 7) normal fetal skin mRNA
(Invitrogen), 8) human adrenal gland mRNA (Clontech), 9) human bone marrow mRNA
(Clontech), 10) human leukemia lymphoblastic mRNA (Clontech), 11) human thymus
mRNA (Clontech), 12) human lymph node mRNA (Clontech), 13) human spinal cord
mRNA (Clontech), 14) human thyroid mRNA (Clontech), 15) human esophagus mRNA
(BioChain), 16) human conceptional umbilical cord mRNA (BioChain).



1405 1416 1425-1427 1431 1436-
1437 1442 1446 1448 1453 1459
1466 1472 1478 1482 1496 1501-
1503 1506 1512 1522 1527-1528
1531 1533 1541 1547 1569 1571
1585 1589 1597-1598 1600 1608-
1609 1614-1616 1620 1623-1624
1626-1628 1630 1638 1641 1643
1649 1653 1656 1662 1667 1669
1674-1675 1683 1685-1688 1699
1702 1709-1710 1715 1717 1722
1724 1729 1731-1732 1735-1739
1741 1743-1744 1748-1749 1755
1760-1762 1767 1773 1778 1785-
1786














diaphragm 


BioChain 


DIA002 


137 282 289 730 780 986 1409 1478 1599 1614










endothelial cells



Stratagene



EDT001



3 5-10 13 15-21 24-26 29 34 37-
39 42 44-45 50-51 53-55 57-58
60-61 65-66 68-69 73-74 77-78 80
82-83 85 87 89 93-9S 101-105 108
110 112-114 116 118-122 124 128
133-134 137-142 147-150 152-153
161-163 166-172 176-179 187 190
192 194 196-201 204-207 210 212-
214 220 224 229-230 233 235-236
240-241 251-252 258 261-262 265
267-269 272 276-277 279-281 284-
285 288 290 295-296 301-302 310-
311 313 316 321 325 329 331-333
335 340 342 351-355 360 371 375
380-382 384 387 390 392 397 400
407-408 410 412 414 416 425-427
431 434-436 439 444-445 449 454
463-464 472-475 477-479 486 488-
490 497-498 500-504 510-513 516-
522 524 526-528 532-534 536-
540 542-546 548 561-563 566-567
572-576 579 581 585-586 589 593
595 597 599 603 607-612 615-617
620 622 626 630 632-634 638-641
644 647 656-660 662-664 670 673
678 680-682 692-697 707 709-710
712-713 719 730 732 734 736 738
743-746 751 759 768 771 773 775-
778 783 786-789 793 800 803 805-
807 810-811 814 816-818 821-822
824 826 828-829 832 834-838 842-
845 848-850 854-860 862 864 869
871 874 876-879 883 885 887 890-
891 894-895 898-900 903 908 910-
913 916 919-922 924 926-928 930-
935 939 943 948-949 951-954 957
959-961 964 969-970 973 975-978
983-984 988-990 992-993 996-997
1000 1002 1004-1013 1016-1020
1022-1025 1028 1031 1033-1034
1038-1046 1050 1055-1056 1059-
1060 1062-1064 1067-1070 1072-
1074 1076 1078 1082 1086-1087
1089-1090 1093-1097 1099-1103
1107 1109-1113 1116-1117 1124-
1126 1128-1131 1134-1135 1138
1140 1144-1145 1148-1149 1153
1157 1160 1163 1171 1183-1184
1198-1199 1202 1205-1207 1211
1216-1217 1219 1221 1225 1229
1232-1235 1238-1241 1243-1244
1246 1250 1253 1257-1258 1261
1265-1266 1268 1270-1271 1274-
1277 1280-1283 1285-1286 1288-
1290 1293 1295 1298 1308 1312
1317-1320 1324-1325 1327 1329-
1330 1334-1335 1338 1342-1343
1345-1347 1350 1355-1356 1359
1367 1369 1374 1376 1379 1398
1400 1406 1408 1414 1417 1419
1424 1426 1428-1431 1434-1438
1440 1442 1448 1450 1462-1466
1468 1472 1474 1478 1487-1488
1491-1493 1501-1504 1506 1509
1511 1516 1520-1521 1526 1529
1531 1536-1537 1539-1540 1546-
1547 1549 1552 1555 1557-1559
1561-1565 1568 1571 1575 1578-
1579 1581-1583 1587-1588 1590
1592 1597 1605-1606 1611 1613
1615 1618-1621 1624-1628 1630-
1631 1634 1636 1638 1641 1643-
1650 1652-1659 1664 1666-1667
1669 1671 1675-1681 1683-1688
1696 1698 1703 1711 1715-1716
1719 1722-1723 1726 1731-1733
1736 1739-1741 1743-1744 1749
1755 1760-1761 1765 1767-1768
1771-1773 1776 1779 1783-1786



Genomic clones 
from the short 
arm of 
chromosome 8 
esophagus 



Genomic DNA 
from 
Genetic 
Research 



EPM001 



286 686 1297 1303-1304 1352
1411-1412 1754



BioChain 
Clontech 



ES0002 



131-132 261 289 380 503 860 092 
1000 1007 1397 



fetal brain 



FBR001 



fetal brain 



62-63 89 112 126 194 322 336-338 
379 391 411 481 546 563 607 679 
710 867 1012 1031 1055 1251 1262 
1320 1407 1643 1652 1686 1731- 
1732 1746 1765 



Clontech 



FBR004 



fetal brain 



68-69 90-91 139 212-213 301 331 
362 374 403 436 611 645-646 659 
668 670 691 785 805 845 1163 
1209 1216 1232-1233 1238-1239 
1387 1410 1416 1430 1496 1536 
1547 1593 



Clontech 



FBR006 



5-9 25 43 60 62-63 65-66 70 72
80 87 92 101 103 108 114 136 139
149 152-153 157 168 171-172 175
207-208 210 212 213 221-226 237-
238 251-253 266 272 279-281 295
301-302 307 310 317-318 321-324
330 333-334 336-338 346-347 352
357 370 373 377 379-380 382 384
391-392 397 399 402 406-408 410-
411 417 421 424 426-427 430 436-
437 440-443 454 460 464 467 473
476 483 488-489 495 497 508 510-
513 516 519-520 524 530 537-540
544 547 550 561 567 572-574 582
590-591 595 597 604 607-609 615
623 628-629 631 634 638-640 655
657-658 660 665 669 674-675 679
689 691-694 696-697 699 701 706
710 716 720 728 732 734 736 742-
744 757-760 763 775-778 780 799
806-807 810 817-818 826 839 843
858 861 864 871 872 884 890-891
894-895 898 904 915 921-923 935-
936 938 945 950 952 955-956 958-
959 961 963 967 969-971 990 992



fetal brain 
fetal brain 



999 1001 1005-1006 1008 1013
1016 1022 1024 1029 1030 1032
1035 1042 1047-1048 1052 1056
1065 1067 1070 1082 1089 1109
1114-1115 1119 1131 1143 1149
1151 1153-1156 1160 1163 1167
1172-1173 1178 1184 1186 1188
1190-1200 1211 1216 1222 1223
1226-1227 1229 1231 1236 1245
1253-1255 1258 1260 1262 1266
1270-1273 1281 1287 1308 1309
1314 1317 1320 1326 1334-1335
1339 1341 1344 1350 1356 1369-
1371 1373 1376 1379 1381-1382
1386 1392 1396-1398 1419 1423
1425-1426 1428-1429 1432 1437
1440-1441 1448 1466 1470 1482
1502-1503 1507 1511 1513 1516
1519 1536 1544 1549-1550 1557-
1559 1573 1589-1590 1598 1608
1611-1614 1619 1621 1625-1626
1640 1651 1657-1658 1676 1679
1693 1696 1703 1704 1713 1714
1718 1720 1722 1724 1726 1728
1730-1733 1735 1736 1738 1739
1742 1745 1755 1759-1761 1765
1767 1771-1772 1777 1779 1780
1786



Clontech 



Invitrogen 



FBRs03 
FBT002 



235-236 520 864 10^8 1188 1587



fetal heart



fetal kidney 



Invitrogen 



FHR001



Clontech 



FKD001 



15-18 20-21 24-25 29 34 43 61-63 
77-78 98 101 103 107-108 128 130 
136 146 148 165-166 171 174 181 
185 196-198 204-205 208 223 230 
235-236 251 253 261 268-269 280- 
281 284-285 288 309-311 321 329 
334 339 346-347 350 357-359 381- 
383 390 407 418-419 430 434-435 
438 443-444 461 464-466 483 490 
494 509 516 519 522 527 557 561- 
562 572-573 590-591 595 597 623 
632 647-648 650 655 669-670 672 
682 690-691 700-701 710 717 736 
746 782 784 788-789 814-815 825 
829 840-841 847 854-855 857-858 
897-900 904 919 925 935-937 946 
948-949 954 960-962 966 969-970 
986 996 1000-1001 1005-1007 1012 
1014 1022-1028 1045 1052 1055 
1068 1070 1072 1078 1082 1085 
1090 1109 1115 1118 1120 1128
1136-1137 1144-1145 1149 1156- 
1157 1193-1195 1198 1204-1205 
1220 1222 1234 1257 1262 1271 
1274-1275 1280 1285-1286 1294
1312 1314 1317-1320 1330 1342 
1344-1345 1349-1350 1355-1356 
1358 1364 1369 1379 1383-1384 
1431 1435 1476 1507 1519 1532 
1536 1547 1554 1564 1567 1578 
1582 1587 1593 1595 1601 1608 
1615 1619-1621 1638 1644 1661 
1665-1666 1673 1687-1688 1690 
1715 1723 1728 1749 1753 1757 
1759-1761 1765 1771 1774 1776 
1778 1781-1782 1786 



10S 124 180 289 864 1036 1148 
1229 1614 1616 1762 1785 



5-8 11 40 47 57 65-66 82 85 102 
124 163 171 216 222 224 235-236 
258 277 280-281 307 310 314 330
371 387 392 395 403 422-423 [illegible]
436 443 455 46S 500 519 522 542
563 572-573 585 600 619 623 650
654 657-658 660 679 719 731 780
798 821 833 844 854-855 857 864
868 878 911 929 958 960 969 990
992 1007 1046 1087 1103 1129
1139 1285 1312 1331 1355 1369
1371 1376 1391 1422 1425-1426
1440-1441 1470 1543 1598 1601
1618 1631 1651 1654-1655 1669
1678-1679 1691-1692 1733 1785


fetal kidney 


Clontech 


FKD002 


352 384 426-427 440 583 602 1060
1131 1324-1325 1636








fetal kidney 


Invitrogen 


FKD007 


20-21 82 163 335 679 988-989
1000 1227 1230 1320 1554




fetal lung 


Clontech 


FLG001


35-36 94 323 371 398 426-427 445
473 549 560 604 616-617 626 631
649 651 719 746 786-787 832 842
849-850 864 894-895 1075 1178
1182 1200 1206 1309 1311 1345
1429 1493 1567 1576 1620 1686


fetal lung 


Invitrogen 


FLG003 


9 15-16 29 41 47 68-69 83 88-89
102 124 137 152-153 165 196 224
229 231 249 254 256 267 291-292
300 325 333 344-345 352 373 376
379 384 408 426-427 430 432 467-
468 475 483 486 493 516 531 535
545 547 549 564 582 602 623 644
660 662-664 670 673 725-726 728
761 766-767 774 805 830 852-853
864 875 921 932 937 946 949 963
988-989 1014 1016-1017 1024 1027
1090 1097 1170 1185 1200 1215-
1216 1224 1258 1290 1309 1320
1342 1347 1355 1369 1381 1413-
1414 1431 1438 1449 1491 1512
1536 1547 1557-1560 1567 1590
1601 1636 1644 1653-1655 1662
1667 1671 1675 1680-1681 1706
1739 1760-1761 1769








fetal lung 


Clontech 


FLG004 


103 276 334 465-466 737 843 1131
1614 1658










fetal liver-
spleen



Columbia
University



FLS001



3-11 13 15-21 25 30-39 41-48 50-
51 54 56-58 60-66 68-69 72 75
77-80 82-83 85 87 89 92-103 105-
110 112 116-124 126-127 130 133
135-139 141 144 147-149 152-153
157 163-165 167-172 174 176-178
180 186 188-190 193-194 196 198-
200 202-206 210-214 219 221-231
233-236 240-244 246-247 250-251
255-256 258 261-265 268-269 272
274 276-278 280-281 284-286 288
293 295 299-301 304 306-307 309
311 314 316 318 320-321 326 329-
332 342 344-345 350 352-353 356-
358 360 362 370-374 376 378-384
386-387 390 392-393 400-401 403
406 408 410-412 415 417 419 422-
437 439-442 444-445 448 452-454
456 459 461-470 472-479 481-483
487-488 490-491 493 500-501 503-
506 509-513 515-520 522-524 526-
529 531 534 536-540 542 547-549
553-554 561-562 564 567-568 571-
576 579 581 583 585-597 599-605



fetal liver- 
spleen 



607 610-613 615-621 623-624 626
628-634 636-640 644 647-650 655-
660 665 669-670 672 674-675 678
681-682 684 690-695 697 702 708-
710 713-714 716-719 725-728 730-
731 734 736 738 740-741 743-746
748 750-751 759-766 768 772 774-
777 779 783-788 793 796 798 800-
805 808 810-812 814 818-819 821-
824 826-832 834-837 843-847 849-
867 869-876 878-883 887 889-895
897-898 902 904-914 916 919 921-
928 930-937 939 945-950 953-958
960-961 963-965 967 969 971 974-
978 980-983 986 988-990 992-993
995-997 1000-1002 1004-1008 1012
1014 1016-1019 1025 1026 1028-
1031 1033 1035-1036 1039-1044
1047 1049-1050 1053-1056 1058-
1059 1061-1064 1067-1070 1072-
1074 1076 1078 1082 1085-1087
1089-1090 1097 1099-1103 1107-
1113 1115-1119 1121 1123 1125
1127-1128 1131-1134 1136-1137
1144-1150 1153 1159 1160 1163
1170 1175 1177-1178 1188 1190-
1192 1195-1200 1202 1206 1208-
1211 1214 1216 1218 1221-1222
1225 1227 1234 1237 1241 1244
1246-1247 1251 1254 1258 1261
1266 1268 1270-1273 1277-1282
1284-1285 1287-1290 1294 1299-
1300 1306-1308 1313-1320 1324-
1325 1327 1330 1332-1333 1338
1341 1343 1345-1347 1349-1350
1353-1360 1362-1363 1365-1367
1369-1370 1372-1374 1376 1378-
1381 1383-1384 1386 1389-1391
1400 1402-1403 1405 1410 1413
1415 1417-1419 1422 1429 1431
1435-1437 1439-1442 1445-1446
1448-1449 1454 1458-1459 1466-
1470 1472 1474 1477-1478 1480
1482 1485 1491-1493 1496-1498
1501-1507 1509 1511-1512 1516-
1519 1524-1526 1529 1532 1536-
1541 1546-1547 1549 1550 1552-
1554 1562 1564 1569 1572 1574-
1575 1578 1581 1583 1587-1588
1591-1592 1594-1595 1597-1598
1600-1604 1611-1612 1614-1615
1617-1618 1620-1622 1624-1625
1627-1628 1630-1632 1634-1639
1645-1651 1653-1662 1664 1667-
1669 1671 1673-1674 1676-1688
1690 1696 1701-1703 1706-1709
1711 1713-1714 1718 1719 1722
1724-1727 1731-1733 1738 1740-
1741 1743-1744 1746 1748 1751-
1752 1754 1760-1765 1767-1773
1780 1783-1786



Columbia 
University 



FLS002 



3-11 13 15-21 26 29 32 35-39 42 
44-45 48 50-51 54-55 57-58 61 64 
68-69 73-75 78 80 82 84 87 95-98
100 103 105 107-108 110 112-113 
116-119 122-125 128 130 137-138 
145 147-153 155 157 159 161-163 
156 168 171-172 174-175 177 181 
188-189 193-194 196-198 200-203 
206 212-215 219-221 223 225-229 
231-232 240-244 246-247 250-251 
258-259 262 264 268-269 272 275 
277 280-281 284 286 288 290-292 
295 298-299 301-304 306 308-310 
318 320-321 323 325 329 331 334 
342 348-349 352-353 356 359 368 
371 374 376-379 381-384 386-387 
392-393 397-398 400-401 403 410- 
413 421 423 426-427 429-430 433- 
436 438 440 443 445 448 451-452 
454-455 460-463 465-467 469 471- 
473 475-476 478-479 481-483 487 
490-491 493-494 497 500-501 SOS- 
SOS 509-513 515-517 519-520 524 
526-531 534 537-542 544 547 552- 
554 556 558 561-562 564-567 571- 
577 583-587 590-591 593 595 597 
601 604-606 608-613 616-617 619- 
624 626-632 634 637-642 644 647 
649-652 654-659 662-665 669-672 
674-675 681-682 685 688 690 696 
698 700-703 707 709-710 713 717 
719-721 723-724 728 731-732 734 
737-738 742-745 748 752 754 759 
763-766 768 770 773-777 780 782 
784 786 791 795-798 801-802 805 
808 811-812 818 823-824 826-827 
832 834-837 839 843 846 848-856 
858-861 865 867 869 871 873-874 
876 878 881-882 887 889 892 894- 
898 901-902 904 906-908 913-915 
919 921-924 926-932 934-935 937 
939-941 943 946-947 950 953 958 
961 965-967 971 973-975 977-979 
981 984-985 990 992-993 995-997 
999 1001 1004-1007 1009-1011 
1013 1016 1020 1023 1025 1027- 
1031 1033-1035 1039-1042 1044- 
1045 1049 1053 1055-1056 1058- 
1059 1062 1064-1065 1067-1070 
1072-1074 1079 1082 1087 1089 
1093 1097 1099-1103 1105-1107 
1109-1114 1123 1125-1127 1132- 
1134 1140 1143-1145 1148-1150 
1156 1158 1160 1163 1172-1173 
1177-1178 1181-1184 1190-1192 
1195-1197 1199 1204 1206 1208 
1211 1214 1216 1219 1227 1230 
1234-1235 1237 1240-1241 1243 
1245 1247 1256 1258 1260-1261 
1264 1268 1270-1271 1275 1278- 
1279 1284-1286 1288-1289 1299- 
1301 1306 1308 1312 1314 1317- 
1319 1323-1325 1327-1330 1334- 
1335 1339 1343-1347 1349-1350 
1354-1355 1357 1360 1362-1363 
1365-1367 1369 1372 1376 1378- 
1380 1386 1389-1391 1394 1400 
1403 1406 1409 1416-1419 1422- 
1427 1429 1435 1437-1438 1440- 
1442 1446 1448-1450 1453 1460- 
1461 1468 1470 1472 1474-1475 
1478 1482 1486 1490-1493 1496 
1498 1500-1504 1506 1508-1509 
1511-1512 1516 1518-1519 1521 
1524-1528 1531 1536-1538 1543 
1547 1550 1554 1556 1564 1567- 
1569 1580 1587-1588 1591-1592 
fetal liver-
spleen



Columbia
University



1597-1598 1600-1601 1611-1612
1618-1628 1630-1631 1635-1638
1641 1646-1649 1652 1654-1659
1661-1662 1664 1667-1669 1674
1676-1679 1683-1684 1686-1688
1691-1692 1699 1702 1707 1711
1713-1714 1717 1719 1722 1726-
1727 1730-1733 1738 1740 1743-
1744 1748-1752 1758 1760-1761
1763-1764 1767 1769 1772-1773
1776 1779 1783-1786







FLS003 



fetal liver 



103 300 318 321 352 372 379 381 
384 392-393 403 422 424 429 434- 
435 440 444 453 503 515 544 592 
978 1064 1324-1325 1327 1333 
1357 1369 1378 1418 1424 1622 
1646 1649 1680-1681 1689-1690 
1717 1743-1744 1769 



Invitrogen 



FLV001 



fetal liver-
fetal liver 



15-16 26 34 58 61 64 70 75 78 89
98 105 112 116 120-121 123 133 
151 165 176 180 194-196 198 200 
204-206 210-211 220 225-226 230 
235-236 239 247 259 261 267 272
277 280-281 303 310 313 317 320- 
321 329 344 356 371 374 376 379- 
382 395 408 412 414 419 429 434- 
435 441-442 465-466 490 494 504- 
506 509 522 527 534 552-553 562 
567 569-570 572-574 607 631 657- 
658 667 669 672 685-686 702 717 
725-726 732 748 759 761 778 784 
786 809 817 829 837 857 861 872- 
873 875 881 889 894-895 909 911 
916 954 963 967 974 977 986 988- 
989 993 995 997 1000 1005-1006 
1008 1014-1015 1020 1042-1043 
1070 1086-1087 1089-1090 1118-
1119 1122 1144-1145 1148 1153 
1157 1159 1183 1195-1196 1227 
1250 1257-1258 1262 1267 1280 
1285 1307 1312 1314 1317-1320 
1344-1345 1349-1350 1355 1362-
1363 1403 1405 1415 1419 1425- 
1426 1429 1431 1442 1448 1463- 
1464 1469-1470 1489 1528 1536 
1539 1549-1550 1557-1562 1577 
1583 1598 1601 1611 1615 1622 
1644 1649 1666 1674 1706 1721 
1738 1746 1763-1765 1774 1776 
1779 



Clontech 



FLV002 



676 998 1719 



Clontech 



FLV004



93 133 214 301 355 374 379 555 
581 601 679 837 847 859 1123 
1236 1270 1313 1324-1325 1327 
1355 1367 1425-1426 1536 1690 
1733 1760-1761 



fetal muscle 



Invitrogen 



FMS001 



26 37-39 50-51 58 84 86 89 98 
113 128 131-132 139 155 172 186 
194 198 201 206 211 230-231 256 
261 276 282 286 302 325 359 361 
376 379 383 398 412-413 419 430 
436 448 452 462-463 473 477 503 
519 529 561 569-570 590-591 597 
607 623 626 635 647 660 672 715 
725-726 730 733 761 775-777 788 
826 837 860 874 913 915 921 935 
970 980 986 988-990 992 1000- 
1001 1007 1014 1027 1035-1036 
1045 1060 1064 1070 1083 1097 



fetal muscle



1099-1102 1116-1117 1121 1164
1173 1198 1208 1228 1240 1258
1266 1270 1277 1298 1317-1320
1324-1325 1329 1336-1337 1369
1383-1384 1399-1400 1403 1409
1433 1505 1514 1542 1551 1554
1557-1559 1562 1589 1599 1620
1632 1644 1650 1652 1671 1675
1712 1725-1726 1743-1744 1754
1766



Invitrogen 



FMS002 



fetal skin 



119 221 273 402 426-427 463 547 
599 736 869 1000 1033 1083 1266 
1431 1440-1441 1468 1545 1599 
1673 1678-1679 1687-1688 1710 
1712-1714 1723 1725 1731-1733 
1743-1744 1760-1761 1767 



Invitrogen 



FSK001 



1 4-11 15-16 20-23 25 29 33 40 
43 46 56-57 60-61 64-66 75 82 87 
97-98 105 107-108 113 118-119 
123 133 135-137 139 144 146 148 
151-153 156 163 170 176 180 188- 
189 197-198 200 202-203 210 218 
222 231 246-247 261 263 265-270 
277 285-286 290 293 299 301 307 
311 321 325 328 330 333-335 339 
341 345 351-352 355-356 358-359 
362 368 370 372 376 379-382 384 
388 394 404-405 408-409 411-412 
419-420 424 426-427 436 441-442 
445 448-449 454 462 465-466 472 
476 490 493 504 506 509 515-517 
519 526 531 537-540 547 549 560- 
561 567 572-573 581 584 589 611- 
612 615 623 630-631 635 647 649 
651 657-658 660 662-665 667 669
672 676 678 681 688 701 704-705 
709-710 713 717 720-721 725-726 
728-729 732 748 750 753 759 764 
766 770 775-777 780-781 786 788- 
789 798 809 811 814 816-817 822
824-826 831 842 857 859 861 863- 
864 881 894-895 908 910-911 916 
918 922-923 928 932-933 935 937 
946 948-949 953 960-961 966-967 
970 975 977 986 990 992-993 999- 
1000 1004 1007 1013 1018 1025 
1027 1032 1035 1041-1043 1054 
1057-1058 1060 1062-1064 1069 
1072 1077 1090-1091 1097 1099- 
1103 1108 1113 1119 1123 1128 
1131 1134 1140 1148-1149 1152- 
1153 1156 1163 1167 1178 1182 
1189 1192 1195-1196 1198 1201- 
1205 1208 1211-1212 1216 1219- 
1220 1222 1225 1240 1243 1258 
1266-1267 1274 1277 1280 1282- 
1285 1299 1310 1317-1322 1324- 
1325 1329-1330 1342 1344 1346 
1349-1351 1354-1357 1365-1366 
1369 1371 1373 1376 1378 1380 
1383-1384 1387 1399-1400 1405 
1410 1427 1429 1431 1433-1435 
1439-1441 1448-1449 1454 1457 
1468 1470 1472 1475 1480-1481 
1487 1490-1491 1493 1498 1509 
1512 1521 1525-1526 1529 1535- 
1536 1547 1549 1557-1559 1588 
1592 1595 1597-1598 1601 1603- 
1604 1608 1611 1614 1618 1624- 
fetal skin



1626 1632 1634 1636 1641 1643-
1644 1646 1654-1657 1660-1662
1665 1668 1675 1685 1687-1689
1702-1703 1709-1710 1716 1719
1724 1727 1731-1732 1737-1740
1742 1747 1749 1755 1760-1761
1765 1772 1776-1777 1779-1780
1786



Invitrogen 



FSK002 



fetal spleen 
umbilical cord 



13 286 302 307 313 321 330 335
339 341 354 370 372 385 400 402
408 414 426-427 433 436 450 454
515 544 585 598 767 810 845 939
1076 1109 1155 1317-1320 1326
1333-1335 1343 1347 1350 1369-
1371 1377-1378 1391 1397 1422
1466 1647 1656 1678-1679 1687-
1688 1693 1718 1721 1725 1731-
1732 1739 1755



BioChain 
BioChain 



FSP001 



110 137 211 353 589 927 1108 
1639 1771 



FUC001 



fetal brain



GIBCO



HFB001 



4-8 10 12 14 17 33-36 44-46 57 
64 68-69 75 82 85 101 104 113- 
114 116 119 122-124 133 137 153- 
154 157 161 163 166-167 175 181- 
184 186 192 197-198 200-202 212- 
215 230 234 246-247 251 256 263 
267 271-272 280-281 284 295 301 
314 317 321 326 333-335 345 351 
356 368 371-373 379-380 386 390 
392 394 406 408-410 412 414 416 
420 424 427 430-436 438 444-446 
454 459 461 463 467 473 482-483 
486 488 490 495 504 509 524 526 
537-540 547 555 561 574-577 588- 
591 593 606 615 620-621 632 637 
645-647 650 659-660 662-664 667- 
668 674-675 684 687 696 698 701 
703-705 709 711 714 719-720 725- 
727 732 749-750 762 765 771 775- 
777 780 789-791 793 796 802-803 
814-817 822 833 843 845 848 858
861 864 875 879 888 894-895 897- 
900 903 906-907 911-912 925 930- 
933 936 940 948 953 960 966 977 
984 990 992 998 1000-1001 1005- 
1007 1016 1023 1025 1037 1046- 
1047 1059 1061-1063 1073 1076- 
1077 1089 1094-1097 1112-1113 
1115 1134 1144-1148 1151 1154 
1156 1163 1171 1197 1204-1205 
1208 1216 1218 1224 1234-1235 
1243-1244 1246 1279 1283 1286- 
1287 1298 1316 1320 1344 1346 
1350 1357 1359 1371 1373 1375 
1381 1398 1400 1403 1408 1414 
1424 1427-1428 1431 1433 1440- 
1442 1446 1454-1455 1479 1482 
1484-1485 1489 1492-1493 1504- 
1505 1513 1525 1527 1536 1538 
1546 1565 1567 1571 1573 1575- 
1576 1578-1579 1591 1595 1600-
1601 1608 1612 1615 1621 1624 
1626 1636-1637 1647-1648 1651 
1653 1656 1658 1661-1662 1672 
1675 1682 1684 1686-1688 1690 
1709-1710 1722 1727 1729 1735- 
1738 1740-1741 1760-1761 1768 



4 9 11-13 17-18 22-23 25 37-39 
42-47 50-51 54-55 58 60-61 65-66 
12 75 77 80 


82 


85 9 


0-91 


- 

94 100- 








102 


107 


110 


112 


-116 


118- 










123 


126 


128 


134 


136 


-140 


1 A n 1 AO 








153 - 


155 


157 


161 


165 


169- 


172 175 








181 


186 


188 


-189 


197 


-198 


o r\A nr\c 








208 


210 


215 


222 


-223 


225- 


J. £. a «s-5 U 








23 5- 


238 


24 0 


-241 


247 


253 


256-258 








260 - 


262 


267 


-269 


276 


279- 


1(31 OOA 
4D1 BO? 








na c 


289 


298 


300 


-302 


307 


310 318 








321- 


323 


325 


3 30 


-331 


339 


"3 41 "1/ c _ 








34 9 


352 


354 


356 


-359 


362 










371- 


372 


377 


379 


-380 


382 


JOI JO/ 








390 


400 


408 


414 


-416 


419 


a? a /m 








4 34- 


4 3 5 


438 


441 


-443 


449 










455 


4 57- 


-463 


470 


472 


-473 


47c Ann _ 








478 


482- 


-483 


486 


-488 


490- 


A Q1 A Ql 

*iyx *x y J 








496 499-500 502-504 506-
512 516 519-520 522 525-
530 537-540 543-544 546-547 566-
567 569-570 572-582 585 [illegible]
591 593 595 599 601 604 [illegible]
611-612 614-620 622-624
636 643 645-647 650-652 [illegible]
661 665 667-668 670-672 [illegible]
681 687 689 692-694 697 [illegible]
714 717 721 727 729-732 734 736
738 743-746 750-751 759 763 766
770 772 775-777 784 789 791 796
799 802-805 810-811 814 819-821
824 826 830 834-837 839-850 854-
856 858-860 862 864 869 871 876-
877 879 883 886-887 890-891 893-
895 898-901 905 908-910 912-916
919 922-923 925 927 930-933 935-
938 948 952-960 963-964 [illegible] 967 969-
972 975 978-979 981 983 [illegible] 986-987
990 992 995 997 999-1002 1005-
1009 1011-1013 1016 1018-1019
1023 1026 1029-1031 1033-1035
1038 1041 1047 1050 1053 [illegible]
1059 1064 1068 1070 1072 [illegible]
1078-1079 1081-1082 1086 1089
1094 1097 1103 1107-1109 1113-
1115 1121-1122 1127 1134 [illegible]
1138 1140 1143 1148-1151 [illegible]
1156-1157 1159 1167 1170 [illegible]
1193-1194 1200 1202 1207 [illegible]
1211 1216 1219-1220 1226 [illegible]
1229 1232-1234 1240-1241 [illegible]
1246 1249-1251 1253-1254 1258
1267-1268 1271 1276 1279 1282
1285-1289 1293-1294 1305 1307-
1308 1312 1316 1320 1327 1338-
1339 1341-1344 1346 1349 1355-
1357 1359 1365-1366 1369-1370
1373-1375 1379 1386 1389 1394
[illegible] 1409 1413-1414 1416-1417
1420-1421 1425-1427 1430 1433
1437 1439 1442 1445-1452 1454-
1457 1459 1463-1464 1468 1470
1474 1477-1479 1489 1492 1494
1497-1498 1501-1503 1507 1509
1511-1513 1517 1520-1521 1524-
1526 1531-1533 1535 1537-1538
1547 1554 1556-1559 1564-1567
1571 1584 1587 1589 1594 1599-
1601 1611-1612 1614-1616 1619-
1620 1625-1628 1630-1631 1634
1637-1638 1640-1643 1645 1648-






















1649 1651 1653-1655 1657-1658
1664-1665 1667 1669 1673 1678-
1679 1683-1684 1686 1693 1701
1704-1705 1709 1713-1714 1717-
1720 1724 1727-1728 1731-1733
1737-1738 1743-1744 1752 1754-
1755 1757 1760-1761 1765 1772
1779 1785










macrophage

Invitrogen

HMP001

5-8 110 204-205 503 £34 678 85$
878 933 988-989 1379 1448 1504


infant brain

Columbia University

IB2002

10 12-13 15-18 22-23 25 29 34
37-39 43 47 50-51 54-56 58 60-63
65-66 68-69 72-74 80 82-83 86
88-92 97 100 102-104 106-108 110
112-113 115-116 118 123 128 130
134-136 138-139 143 147-149 151-
152 154-155 163 165-167 169 172-
175 181-184 186 193-196 198 201
203-205 209-210 214-215 222 224-
226 231-232 235-236 239 246-247
252 257 260 268-269 272 276-277
279-281 286 288 291-292 295 298
300-301 304 307 310 313 321-323
330-331 333-334 339 346-347 349
352 356-357 362 371-372 377 379-
380 383-384 392 397 401 406 408
411 413-414 416 418-419 422 428
430-431 434-435 438 443 449 453-
454 461 464-466 469-470 472-473
475-476 478 482-483 487 490 492
494 497 503 507-508 510-513 516
519-520 524-526 530-534 536-540
547 550-551 561 563-564 566-567
572-576 579 581-582 584-587 590-
591 593 595-597 607-609 611-613
616-617 620 622-624 627 631 637
641 645-647 650-655 657-658 660-
665 667-675 689 691 695 697 699
703 707 713-715 717 721 728-731
733-736 739 743 745 751 755 759
763 769-770 772 778 780-781 785
788-789 793-794 799 803 808 811
814 825-826 830 834-836 840-843
845 848-850 854-855 860 862 864-
865 870 872 875-876 878 886 888
890-891 894-896 898 903-904 916-
917 919 922-925 927-928 930-932
934-936 938 941 945-946 948-950
953-954 959-962 966-969 977 979
981 986-990 992 997 999-1000
1004-1006 1014 1016 1018-1019
1024-1025 1033 1036 1047 1051-
1052 1054-1055 1057-1059 1063-
1064 1068-1070 1073 1081-1082
1085 1089 1108-1113 1118-1120
1123-1124 1130 1132-1138 1140
1149 1151 1153-1154 1163-1170
1172 1174-1175 1183-1184 1188
1190 1193-1194 1196-1197 1199
1204 1208-1209 1211 1218-1222
1226-1227 1229 1231 1234 1241
1247 1249 1251 1256 1258 1261-
1262 1269 1274 1279 1281 1283
1285 1287-1289 1294-1295 1305
1307 1313-1314 1316-1320 1329
1332 1341-1342 1345 1349 1356
1362-1363 1365-1366 1368-1370
1374 1381 1383-1384 1388 1400
1403 1406-1407 1413 1417 1420
1423 1429 1431 1435-1436 1439-
1441 1443 1447-1449 1451-1452
1454 1455 1457 1459 1463 1465
1468 1470-1471 1475 1479 1482-
1483 1485 1493 1494 1496 1498-
1499 1502-1503 1505-1507 1509
1522-1523 1525 1528 1531 1533
1542 1546-1547 1549-1550 1554-
1555 1563 1565-1567 1569 1575
1580 1583-1586 1588 1590 1592-
1593 1595 1598 1600-1601 1608-
1610 1612 1614-1616 1619 1621
1624 1626-1627 1630-1633 1637
1639-1640 1642 1644 1647 1652
1654-1655 1658-1659 1664 1665
1672-1673 1676-1681 1685-1688
1693-1695 1701-1702 1704 1708
1717-1720 1723-1724 1726-1728
1733 1735-1741 1743-1744 1752
1755-1758 1762 1765 1771 1774
1777-1778 1786

infant brain

Columbia University

IB2003

17-18 20-23 29 34 43 60 68-69
78-80 88 100-101 107 110 112 118
123 128 133 135-137 146 148 152
159 166 169 174 194 198 203 215
223 225-226 229 235-236 247 260
276-281 286 290-292 295- 300-301
310 322 324 331 334 339 346-347
349-350 352 357 371 376-377 382
384 403 408-409 414-415 453-455
472 476 478-479 490 503 507 516
520 530 534 536-540 551 563 572-
576 585 587 590-591 593 595-596
601 606 612 616-617 620 622-624
650 652-653 661 665 670-671 674-
675 678 689 715 717 727-728 730
734 759 775-777 780-781 785 796
806-807 811 824 845-846 864 869
875 882 889 894-895 898 904 917
919 921-923 932 935-936 946 950
954 962 977 979 997 999-1000
1005-1006 1009 1011 1017 1024
1033 1037 1043 1055 1057 1109
1114-1115 1120 1123 1127 1144-
1145 1149 1151-1153 1160 1161
1170 1174 1193-1194 1196 1199
1202 1206 1209 1220-1221 1226
1229 1240-1241 1251 1258 1284
1288-1289 1305 1314 1327 1333
1344 1347 1350 1356-1357 1365-
1366 1378-1379 1388 1400 1403
1421 1423 1431 1436 1440-1441
1446-1447 1457 1459 1471 1499
1503 1507 1509 1536 1546 1557-
1559 1567 1572 1587 1595 1598
1610-1612 1615 1631 1639 1644
1647 1657-1658 1673 1678-1681
1683-1684 1701-1702 1708-1709
1713-1714 1719 1757 1760-1761
1765 1771 1778

infant brain

Columbia University

IBM002

101 113 139 152 2^0 27$ 290-292
374 377 551 563 608-609 653 659
814 954 1005-1006 1029-1030 1130
1164 1209 1258 1294 1305 1320
1327 1357 1431 1498 1507 1615
1640 1694-1695 1763-1764 1767
1779

infant brain

Columbia University

IBS001

10 12 119 175 279-281 321 334
371 446 551 563 623 652 667 669
























671-672 819 949 966 1113 1130
1151 1188 1193-1194 1196 1229
1258 1265 1271 1287 1317-1319
1324-1325 1342 1423 1440-1441
1448 1471 1482 1525 1532 1546
1562 1569 1588 1591 1610 1618
1647 1649 1658










lung, fibroblast

Strategene

LFB001

5-9 17 20-21 25 68-69 82 94 105
153 157 197-198 203 207-208 212-
213 223 262 266 283 302 321 326
333 356 370 427 430 436 446 462
472 493 498 503 516 519 527 535
537-540 542-544 562-565 567 586
599-600 607 615 630 647 662-664
692-694 712 719 745 748 775-777
794-796 810 837 843-847 849 854-
856 869 876 903 934 953 955-956
964 975-976 984 1000 1005-1007
1024-1025 1033 1039 1053 1064
1070 1072 1082 1112-1113 1134
1136-1138 1140 1195 1223 1232-
1233 1246 1279 1285 1295 1311
1320 1334-1335 1343 1427-1428
144S 1478 1482 1493 1504 1537
1552 1555 1567 1575 1582 1598
1620 1625 1632 1638 1645 1654-
1655 1662 1680-1681 1684 1686
1690 1696 1702 1711 1733 1741
1760-1761 1778 1785








lung tumor

Invitrogen

LGT002

5-13 18 20-21 29 33-36 40 43 52
54-55 61 65-66 68-70 73-75 80 85
88-89 93-94 100 103 106-108 112-
113 115-116 118-119 123-124 126
130-132 135-137 139-141 143-144
147-148 151-153 155-156 159 161
164 169 171 179-180 185 190 192
194 196-199 203-208 210 212-214
216-217 219 222 233 240-241 244
246 251-252 255-256 261-262 266
272 276-277 279-281 284 286 288
290 295 298 301-302 309-312 317
321 329 332 341-342 344-345 348
352 358-360 363 368 370-371 376
380-381 384 389-390 398 400 409
414 423 426-427 430 432-436 443-
444 450-451 454 462 468 472-477
480-483 487-466 490-491 493 496-
498 500 503-506 509-512 515-516
519 521-523 526 530 534 541 544
547 554 557 564 566-567 572-576
585-586 588-589 595-596 601 607
611-612 615 619 621 623 626 630
632-633 644 647 649 651 655-656
660 662-665 667 669 672 683-684
696 700 706 710 713 716 718-719
722-723 728 734-739 743 750 752
763 765-766 773-778 784-785 787-
789 791 800 802-803 809-812 814
824 826 828-829 832 838-839 841-
845 849-850 852-855 857-861 864
866 874 878-880 882 887 890-891
897-898 902 904 906-907 910 916
918-920 922 924-925 927 930-932
934-935 937 947 950 953 955-956
961 963 966-967 969 971 977-979
981 984 986-987 990 992-993 995
997 999-1001 1005-1007 1009
1012-1013 1018 1020 1022-1024
1026 1029-1030 1033 1038 1041










1045 1047-1050 1052 1054-1055
1059 1063-1064 1067-1071 1073-
1074 1078 1085 1087 1089 1095-
1097 1104 1106-1107 1109 1112
1116-1117 1119 1126 1134-1135
1139 1141-1142 1144-1145 1148
1152-1153 1156-1158 1167 1170
1172 1178 1195-1196 1198-1200
1202 1204 1208 1214 1216 1219
1222 1227 1234 1241 1247 1252
1257-1258 1265 1267-1270 1276
1278 1280-1281 1283 [illegible] 1288-
1289 1295 1300 1305 [illegible] 1312
1317-1321 1329 1338 1341
1344-1345 1349-1351 [illegible]-1355
1357 1365-1366 1369 [illegible]-1379
1383-1385 1394 1397 1400 1402-
1403 1408 1417 1419 1423-1426
1431 1433-1436 1438 1444 1446-
1448 1454-1455 1460 1466 1468
1470 1474 1480-1481 1483 1486-
1488 1490-1491 1494-1496 1506
1508-1509 1511-1512 1515-1516
1519 1523-1524 1528-1529 1536-
1540 1546 1549-1550 1555 1560-
1561 1565 1567 1569 1575 1588
1591 1593-1594 1596-1598 1600-
1602 1608 1614-1616 1618 1620
1624-1625 1627-1632 1636 1639
1644-1645 1647-1649 1652-1653
1656-1662 1664 1666-1667 1670-
1671 1673-1675 1678-1679 1683
1685-1688 1690-1692 1696-1699
1705 1709 1716-1717 1722 1727
1730 1735 1739 1741 1743-1744
1748-1749 1753 1760-1762 1765
1767 1770-1771 1773 1775-1776
1778-1779 1786








lymphocytes

ATCC

LPC001

4 11-12 18 24-25 30-31 48 50-51
56-57 68-69 80 92 98 103 105 110
126 137 152-153 157 165 172 188-
189 197 203 210 217-218 222-223
225-226 229 231 247 251 256 264
272 280-281 284 300-301 321 325-
326 339 348 352 357 371 382 384
390 400 404 412 414 421 423 426-
427 430-431 445 447-448 451 454-
455 475 503 516 526-527 530 537-
540 549 556-560 563 574 577 589
602 613 615-617 621 623 628-630
636-637 647 649 657-659 690 697
717 723 755 764 775-777 780 786
789-790 793 800 802 822 838 849
866 869 876 881-883 892 898 906-
907 911 921-923 928 975 990 992
996 1001 1004-1007 1033 1050
1054 1078 1107 1135 1140-1141
1143 1148 1158 1163 1177 1199
1205 1216 1226 1231 1236 1241
1244 1250 1258 1260 1265 1269-
1271 1290-1293 1308 1312 1317
1319-1320 1339 1345-1346 1348
1350-1351 1357 1367 1369 1379
1381 1383-1384 1386-1387 1389
1394 1397 1405 1423 1425-1428
1431 1437 1446 1448 1461 1466
1470 1472 1474 1482 1492 1506
1528 1537 1546 1549 1591 1598
1600 1603-1604 1606 1627 1636



1638 1647-1649 1651 1658-1659 

1664 1676-1677 1680-1681 1687- 

1688 1699 1711 1715-1716 1726 

1728 1737 1740 1746 1748 1752 

1756 1758 1777 1779 



leukocyte 



GIBCO 



LUC001 



3-4 10-11 13 15-18 20-21 24-25
30-31 35-36 40 43-45 48 50-51
54-58 60-63 68-69 75 79-80 82-83
85 88-91 93-96 98 100 103-104
107-108 112 116 119 123 125-128
134-140 142 147-149 151 153 155
157 162-163 167 169-172 174 177-
179 186 190 192-199 203-207 210
212-215 217-219 222-223 229 235-
236 247 251 255-258 260 262 272
274-277 280-281 285-286 297-301
307-310 313-314 316-317 321 325-
330 333-334 340-342 348-349 352
354-358 370-371 380-385 387-388
400 405 408-410 412 414-416 421-
425 430-431 434-435 437 439 441-
442 445-451 453-454 456 459 461-
464 468-472 474-479 481 483-485
487-491 496 499-501 503-504 509-
513 516-519 522 526-527 529-531
534 536-540 542 547-549 553-559
566-567 571 574-577 579 582 584-
586 589 593 595-597 601-602 604
606-607 611-613 615-621 623 627-
629 633 636-637 642 644-650 655
659-660 662-665 667 669 674-675 
678 682-684 692-696 698 700 706 
708 710 716-720 725-726 729-736 
738-739 743-746 749 751 753 756 
759 765-766 768 770-778 780 784- 
786 788-790 793 796 798 800 802- 
803 810-811 814 817 819 826 828- 
830 832 834-836 838 843 845-860 
863-864 866-871 877-879 881-892 
894-896 898 902 904-914 916 919- 
925 927 930-932 935-936 941-942 
945 948-949 953 955-956 958 960- 
962 964 967 970-971 973 975 977 
985-990 992-993 995-996 999-1002 
1004-1009 1011 1014 1017-1019 
1022-1023 1025 1027 1029-1031 
1033-1036 1038 1041 1043 1047 
1050 1053-1054 1058-1059 1061- 
1062 1064 1068 1070 1072 1078 
1085-1086 1089-1091 1093 1097 
1106-1107 1110-1113 1115-1117 
1122-1123 1125 1129 1132-1133 
1135-1137 1140-1145 1152 1158 
1163 1168 1170-1174 1176-1178 
1180 1182-1183 1186 1195 1198- 
1200 1202 1205-1206 1211 1216 
1219-1221 1223-1227 1230-1236 
1238-1242 1247 1252 1254 1256 
1258 1261-1262 1264-1265 1269- 
1270 1272-1275 1277 1280-1284 
1287-1293 1299-1300 1306 1308 
1312-1313 1317-1320 1322 1324- 
1330 1333-1335 1339 1341 1343- 
1347 1349 1353-1357 1359-1361 
1365-1367 1369-1370 1373-1374 
1377 1379-1381 1386-1387 1394 
1400 1403 1409 1419 1423 1425- 
1428 1430-1431 1433-1434 1437- 
1438 1440-1442 1446-1448 1450 
1453 1458-1459 1463-1464 1468 1470-1471 1474
1477-1478 1482-1488 1490-1493 1496-1501 1504 1506
1509 1512-1513 1516 1519 1521-1522 [illegible]
1527-1528 1531 1534 1538 1541 1545-1547
1549-1550 1553 1555-1556 1560 1565 1567
1575 1580 1589 1591 1594 1596
1598 1600-1602 1606-1608 1611 1614 1620-1621
1624 1626-1629 1631-1632 1636 1638-1639 1641
1644-1645 1648-1650 1653-1655 1658-1660 1662 1669-1670
1675-1679 1684-1688 1690-1692 1696 1700 1702
1707-1709 1711 1716-1717 1720 1723 1725-1727
1733 1737-1738 1741 1743-1744 1748-1749 1752
1755 1760-1762 1765 1769 1771-1772 1781-1784
1786











leukocyte

Clontech

LUC003

4 35-36 44-45 61 68-69 75 82 102
119 139 154 179 197 244 280-281
324 372 404 430-431 455 461 476-
477 481 503 537-540 554 575-576
581 589 608-609 621-622 624 630
632 647 662-664 669 679 698 764
773 775-777 802 848 851 856-857
879 905-907 915 949 952 990 992
1002 1113 1119 1170 1183 1216
1236-1237 1241 1275 1346 1353
1357 1359 1377 1506 1515 1534
1553 1591 1600 1613-1614 1621
1628 1670 1676-1677 1691-1692
1699 1733 1738 1772



melanoma from 
cell line ATCC 
#CRL 1424 



Clontech 



MEL004 



25 35-36 43 80 104 126 128 150 
163 166 188-189 197 210 215 220 
271 277 280-281 310 317 336-338 
345 351 372 380-381 383 387 412 
415-416 430 445 448 454 456 467 
481 490 499 503 526 528 546 548 
567 575-576 588 601 613 615 647 
660 665 734-735 737 759 778 787 
790 800 832 845 856 859 869 878 
883 887 905 914 932 934 958 976 
985 990 992 999-1000 1025 1031 
1038 1050 1055 1068 1074 1088 
1099-1102 1107 1136-1138 1149 
1156 1163 1172 1190 1195 1200 
1214-1215 1217 1226-1227 1235 
1238-1239 1244 1253 1278-1280 
1293 1311 1320 1330 1334-1335 
1345 1355 1367 1386-1387 1394 
1403 1406 1414 1423 1437 1442 
1465 1521 1529 1536 1539 1541 
1547-1548 1582 1620 1626 1631 
1638 1647 1653 1660 1667 1669- 
1670 1680-1681 1696 1704 1715 
1724-1725 1731-1732 1750 1760- 
1761 



mammary gland

Invitrogen

MMG001

5-8 10 12 14-18 20-21 24 25-29
33-39 42-43 52 55-58 60 64 68-69
71 73-74 79-80 82 89 98 100 103
106 108 112 123 128 133 137 144-
146 148 150-152 154 158 159 165-
166 170-172 174 176 178 181-185
188-190 194-198 201-206 210 217-
222 224 227 228 231 233-237 247
251 253-254 256 261-263 266-267
271 276-277 279-281 284 286 288


















290 297 299 301 304 309-312 318
320-321 323-325 327-329 331-332
334 339 341 344-345 348 350 356
359-360 362-363 368 371 376 379-
383 388 390 393-395 397-398 405
408 412 414-415 423 430 434-437
441-444 448 451-455 462-464 474
476 479 482 485-486 488 490 494-
495 498 503 506 509-512 516-517
519-520 522 527 529 534 537-541
[illegible]-547 549 554 557 562 572-574 587
589-591 597 602 607 618 623 628-
629 632 634-640 644 647-648 650-
652 655 657-658 660 665 667 669-
672 674-676 679 682 688 695-696
706-707 710 713 717 720 722-730
732-734 736 738 743 747-748 750
755 759 761 766 770 780 784 786-
789 794 803 805-807 809 814 817-
822 827-829 837 842 854-858 863-
864 866 869-870 872 878 881 889
893-900 904 906-907 911 916 919
921-923 926 935-937 946 948-949
953-954 957 960-961 963 965-966
970 977-978 984-989 993-997
1000-1001 1005-1006 1008 1013-
1014 1016-1017 1023 1025 1027
1032-1033 1036 1039 1043 1045
1055 1057-1058 1063 1068-1075
1077-1078 1085 1087 [illegible]-1091
1095-1102 1107-1108 1112-1119
1121-1123 1131-1133 1136-1137
1139-1142 1144-1145 1148-1149
1153 1159 1167 1170 [illegible]-1173
1183-1185 1190-1192 [illegible]-1199
1207-1208 1212 1216-[illegible] 1222-
1223 1225 1231 1234 [illegible]-1241
1247 1253-1254 1258-[illegible] 1261-
1262 1270-1280 1283 1285-1286
1298 1307 1314 1316-1320 1323-
1325 1330 1334-1335 [illegible]-1345
1349-1352 1354-1355 1359 1369-
1370 1377 1379 1381 1383-1384
1389 1405 1414 1419 1421-1423
1425-1426 1428-1429 1431 1434-
1437 1439 1448-1449 1454 1457
1460-1464 1466 1471 1480-1483
1487 1489-1491 1493 1505 1507
1512 1519 1526-1528 1532 1534
1535 1539 1542 1547 1549-1550
1554 1561-1562 1564 1567 1572
1576-1579 1581-1582 1587-1588
1592 1594 1596-1597 1601-1602
1607-1608 1610 1612-1616 1618
1621-1622 1625-1626 1631 1635-
1636 1641 1643-1644 1647 1650
1652 1654-1655 1657-1658 1660
1662 1664-1666 1669-1671 1673-
1674 1676-1677 1680-1685 1689-
1692 1701 1706 1713-1715 1719-
1720 1723-1728 1730-1732 1738
1740 1742-1744 1746-1747 1749
1751 1753 1760-1762 1765-1768
1771 1774 1776-1777 1779 1783-
1784 1786








induced neuron cells

Strategene

NTD001

29 35-36 80 116 123 156 163 181
214 230 280-281 284-285 307 321
330 340 358 371 375 377 380 382
422 424 492 497 532-533 542 546



549 566 586 595 612 645-647 654
734 775-778 780 792 799 821 826
856 858 875 936 953 985 990 992
1041-1043 1055 1072 1104 1193-
1194 1206 1223 1246 1253 1274
1288-1289 1291 1294 1311 1320
1349 1359 1412 1423 1485 1620
1623 1645 1684 1705 1715 1751



retinoic acid induced neuronal cells

Strategene

NTR001

5-8 78 268-269 277 383 431 506
623 677 731 999-1000 1199 1425-
1426 1547



neuronal cells 



Strategene 



NTU001 



29 65-66 80 82 110 119 146 152
166 174 181-185 198 227-228 253
284 309 325 332 334 336-338 375
391 393 406 414-416 454 465-466
470 488 503 506 510 512 519 537-
540 572-574 597 602 607 623 647
661 700 702 716 743 771 792 858
904 948 954 977 1000 1005-1006
1025 1064 1068 1122 1148 1185
1219 1226 1234 1246 1271 1283
1295-1296 1311 1317-1320 1329-
1330 1350 1355 1365 1366 1378
1383-1384 1400 1412 1445 1505
1539 1547 1578 1647 1656 1683
1690 1738 1749 1783-1784



pituitary 
gland 



Clontech 



PIT004 



311 314 379 408 419 430 454 10*5 — 
1095-1096 1272-1273 1312 1320 
1378 1652 1671 1720 1725 1736 
1741 1755 



placenta

Clontech

PLA003

5-8 124 208 277 370 843 906-907
1200 1317-1319 1369 1609 1621
1737

prostate

Clontech

PRT001

9 46 57 71 107 147 171 177 197
201 229 231 242-243 274 280-281
307 310 317 330 358 373 382-383
400 430 434-436 461-462 469 477
489 497 500 505-506 513 521 526
531-533 547 618 649 657-658 662-
664 710 729 767 771 789 820 861
871 874 890-891 905 938 945 963-
964 988-989 1002 1025 1033 1045
1061 1095-1096 1112 1125 1142
1196 1198 1202 1232-1233 1241
1258 1272-1273 1287 1295 1313
1333 1341 1344 1349 1360 1362-
1363 1367 1437 1442 1447 1475
1478-1479 1482 1489 1513 1517
1527 1531 1536 1598-1599 1628
1636 1657 1680-1681 1687-1688
1717 1738 1743-1744

rectum

Invitrogen

REC001

17-18 29 33 62-63 71 73-74 83 86
113 126 146 153 158 167-169 195
200 206 261 309 312 341 344 368
373 388 395 408 414 420 430 441-
442 446 448 464 468 483 517 537-
540 547 567 585 589 602 623 628-
629 632 645-647 651 657-658 669
717-719 721 725-726 738 748 750
756 762-763 766 770 774 790 819
825 843 849 851 881 903 909 948-
949 960 986 996 1020 1023 1033-
1034 1064 1067 1070 1075 1086
1108-1109 1113 1130 1139 1153
1159 1172 1178 1185 1187-1189
1205 1220 1225 1240 1244 1271
1317-1320 1323 1334 1335 1350-
1351 1355 1369 1373 1375 1425-
1426 1436 1439 1469 1474 1477
1482 1546 1587-1588 1592 1596
1610 1622 1627 1644 1658 1662
1665-1666 1669 1675-1677 1749
1786

salivary gland

Clontech

SAL001

10 55 97 103 110 140 149 152 158
198 217-218 242-243 256 301 308
312 321 333 351 354 360 410 437
448 473 487 494 496 501 535 555
569-570 572-573 590-591 624 636
651 759 762 764 768 771 788 800
809 826 848 865 879 906-907 925
933 963 1016 1020 1025 1040 1046
1055 1066 1103 1150 1172 1181
1234 1281-1282 1288-1289 1298
1315 1320 1333 1336-1337 1346
1359 1373 1379 1424 1447 1449
1474 1482 1492 1494 1498 1511
1523-1524 1537 1554 1596 1626-
1627 1636 1652-1655 1658 1665
1671-1672 1691-1692

salivary gland

Clontech

SALs03

158 326 1423 1463-1464

skin fibroblast

ATCC

SFB001

1320 1400

fibroblast

ATCC

SFB002

262 736 1025 1253

skin fibroblast

ATCC

SFB003

709 1119 1350 1631 1653

small intestine

Clontech

SIN001

25 142 146-147 151 155 198 203
244 260 271 280-281 286 288 298
301-302 308 312 334 340 371 398
408 412 414 416 423 426-427 430
434-435 445 452 454 478 503 516
519 521 523 543 547 549 555 559
563 569-570 585 592 604 611 626
628-629 632 650 659 681 710 714
718 750 764 780 798 829 842 857
859 866 887 892 894-895 901 904
906-907 912 919 935 997-998 1000
1007-1008 1026-1028 1044 1055
1089 1097 1116-1117 1131 1148
1169 1199 1219 1234 1247 1264
1279 1316 1320 1326 1341 1343
1349 1351 1374 1387 1398 1400
1403 1407 1423 1428 1468 1498
1501 1521 1550 1556 1585 1597
1636 1638-1639 1645 1653 1656
1662 1671 1675 1684 1691-1692
1704 1711 1717 1719 1722 1725-
1726 1729 1733-1734 1743-1744
1762 1767 1780 1785

skeletal muscle

Clontech

SKM001

18 20-21 82 84 101 118 134 148
151 153 166 225-226 258 274 277
289 329 361 412 414 424 440 452
459 470 488 503-504 537-540 647
660 673-675 715 773 780 766 830
905 922 950 963 982 990 992 1020
1047 1063 1115-1117 1121 1134
1228 1268 1284 1298 1321 1329
1336-1337 1343 1409 1413-1414
1509 1599 1624 1644 1653 1712



skeletal muscle

Clontech

SKM002

168 1683 1712

skeletal muscle

Clontech

SKMS03

235-236 1409

skeletal muscle

Clontech

SKMS04

235-236

spinal cord

Clontech

SPC001

4 9 11 17 30-31 35-36 43 46 60
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Tissue Origin 



adult spleen 



RNA Source 



Hyseq 
Library Name 



SEQ ID NOS: 



82 85 92 94 108 110 
167 198 204-205 210 
259 277 280-281 300 
317 372 379 387 392 
430 433 448 467 473 
526 



509 513 519 524 
547 549 551 559 



567 
625 



607 616-617 623 
652 657-658 670-671 
682 709 711 715 719 
749-750 753 775-777 
809 820 832 834-836 
855 858 861 864 871 
898 906-908 917 919 
944 970 985 990 992 
1039 1053 1059 1065 
1077 1082 1085 1097 
1116-1117 1128 1134 
1174 1192-1194 1215 
1243 1283 1294 1307 
1323 1327 1330 1350 
1356 1359 1368 1375
1407 1423 1429 1437 
1454 1470 1482 1492 
1511 1529 1538 1548- 
1571 1578 1598 1600 
1627 1630 1639 1646 
1670 1686 1696 1740 
1771 



116 139 157 
215 229 256 
302 304 315 
419 426-427 
487 489 506 
537-540 543 
569-570 593 
637 649-650 
673 679 681- 
728-729 734 
781 789 791 
847-849 854- 
872 875 884 
924 934 942 
993 998 1013 
1072 1075 
1103 1109 
1151 1170 
1225 1241 
1312 1320 
1353-1354 
1400 1406- 
1443 1448 
1501 1508 
1549 1565 
1614 1625 
1651-1652 
1751 1755 



Clontech 



SPLc01



stomach 



117 312 326 348 424 426-427 431
845 866 1320 1330 1333 1344 
1355-1357 1371 1387 1397 1446 
1538 1579 1669 1686 1739 1767 



Clontech 



STO001 



thalamus 



Clontech 



10 15-16 61 68-69 100 117 149 
197 201 227-228 231 249 273 280- 
281 287 291-292 302 312 358 362 
426-427 430 446 462 475 479 535 
597 620 630 651 662-664 722 739 
780 782 785 846 919 960 964 966- 
967 976 1008 1012 1032 1042 1063 
1071 1135 1170 1208 1234-1235 
1259 1277 1280-1281 1322 1349 
1359 1369 1449 1468 1474 1478 
1487 1493 1498 1557-1559 1622 
1634 1651 1653 1729 



THA002 



9 11 25 85 87 112 137 146 180 
190 198 206 210 212-213 235-236 
239 261 268-269 279 290 301 325 
333-334 341 351 356 364-365 379 
388 393 396 419-420 441-442 458 
477 483 508 525 531 549 567 606 
608-609 647 681 715 725-727 736 
774 782 784 794 827 883 890-891 
899-900 961 997 999-1001 1004 
1034 1055 1097 1129 1144-1145 
1150-1151 1157 1172-1173 1177 
1193-1194 1208 1220 1249 1280 
1305 1345 1355 1369 1434-1435 
1440-1441 1454 1496 1546 1549 
1562 1572 1578 1590 1594 1613- 
1614 1640 1651-1652 1671 1687- 
1688 1703 1743-1744 1746-1747 
1753 



thymus 



Clontech 



THM001 



44-45 54 57-58 62-64 79 104 123 

126 134 153 193 212-213 218 242- 

243 258 274 277 279 297 301 307 

327 330 333 342 351 358 371 410 

430 445 465-466 468 471 483 487 

493 503 506 509 517 526 535 537- 
Clontech



540 546 548 554 567 584 586 590-
591 604 612 621 638-640 645-647 
649 656 660 665 670 698 710 720 
728 735 739 746 759 762 766-767
775-777 780 784-785 800 802 809 
824 826 828 845 851 858-859 864 
866 870-871 878 884 887 892 899- 
900 927 930-931 967 983 986 990 
992 999 1014 1029-1030 1033 1059 
1066 1073 1103 1107 1113 1116- 
1117 1119 1140-1142 1158 1163 
1172 1177 1195 1206 1209 1213 
1216 1218-1219 1221-1222 1227 
1271 1277 1282 1320 1329 1349 
1367 1369 1383-1384 1417 1419 
1423 1425-1427 1448 1477 1488 
1493 1536 1554 1620 1644 1646 
1649 1654-1655 1661-1662 1669- 
1670 1674 1676-1677 1685-1688
1707 1711 1731-1732 1737 



thymus 



THMc02 



636 
673 
725 



834 
864 



5-9 15-21 25 33 35- 
50-51 54-55 60 75 8 
98-100 102 105 112 
141 143 146 157 167 
211 217-219 222 224 
236 240-241 244 251 
262 268-269 286 288 
301-302 309-310 315 
327 334 342 350 352 
373 382 384 400 403 
424 430-431 436 445 
464-467 470 472 474 
497 500 504 506 513 
524 526 530-531 534 
554-555 565-566 569 
575-577 586-587 595 
612 630-632 634 
660 666-667 669 
700 703 708 720 
739 743-744 750-753 
765 767 772-779 787 
800 810 823 829 
854-856 859 861 
890-891 898 908-909 
941 949 958 961 963 
981 986 988-990 992 
1008 1014 1016 1039
1074 1079 1089 1097 
1117 1122 1131 1140 
1145 1163 1172 1175 
1196 1198 1206 1211 
1223 1227 1234-1243 
1267 1271 1280-1281 
1308 1317-1320 1322 
1327 1330 1334-1335 
1350-1351 1355 1357 
1374 1377-1379 1386 
1392 1397 1400 1402 
1417 1423 1425-1427 
1466 1474 1477 1483 
1504 1506 1525 1536 
1566 1594 1598-1600 
1614 1621 1623 1625 
1641 1644 1647 1649 
1658 1662-1663 1671 
1681 1686-1688 1693 
1711 1717-1718 1726- 
1733 1737-1738 1743- 
1761 1771-1772 1779 



36 43-45 48 
3 87 89 93 
117 135-137 
169 192 196 
229 233 235- 
-252 256 261- 

290 295 297 
-317 321 324 
-353 360 370- 
410 414-416 
454-456 461 
-476 483 488 
516 519-520 
537-540 549 
-570 572-573 
603-604 606 
647 650 657- 
-675 678 698 
-726 731 738- 
757 759 763- 
789-790 798 
-836 841 848 
870-871 881 
913 928 933 
967 969 975 
999 1007- 
1041 1073- 
1109 1114- 
-1141 1144- 
-1177 1186
1216 1220 
1261-1262 
1284 1290 
1324-1325 
1339 1346 
1360 1370 
1389-1390 
1406-1407 
1440-1441 
1493 1498 
1545 1549 
1608 1611 
1632 1639 
1653-1656 
1673 1678- 
1705 1707 
1727 1731- 
1745 1758- 
1786 


thyroid gland 


Clontech 


THR001


4 9-10 20-21 37-39 48 50-51 54-57 60-61 65-66 71 83 94-96 98-100
102 104 110 112 115-117 119 123 127 133 136-137 140 149 152-153
155-158 163-164 168-169 171 186 190-192 197 201-203 219-220 229
233-237 246-247 253 256 258 262 265-266 268-269 277 280-281 284-286
288-289 298-299 302 309-311 317 321 326 332 335 341-342 344 348 350
354 358-359 363 368 371-373 382-383 385 394 398 400-401 411 414-415
421 424 430-431 433-436 443-446 450-452 454-455 458 472-474 476-478
482 484-485 487-488 490-494 496-497 500-501 503-504 506 509-513
516-517 519 524 526-527 529 535-540 547 549 562 564 569-570 575-576
588 594-595 601-602 604 606 610 612 615-617 619-623 628-630 634-635
642 647 649-651 660 662-665 668 670 681 690-694 696 698 700 709 721
727-729 732 734 738 740-741 743 745 750 759 761 763 765 770 773 780
785 795-796 798 802 804 823-824 826 828 833 838 841-845 847 849
857-860 867 874-875 878 880-881 887-888 890-892 894-895 898 908
910-911 913-914 922-923 926-927 929 932-934 937 939 941-942 948 953
957 961 963-964 966 978-979 981-982 987 990 992 1001 1004-1006 1010
1014 1020 1024 1033 1038-1039 1044 1047 1050 1052-1054 1056 1058
1068 1070-1071 1077-1079 1088 1094-1097 1105-1106 1112-1113
1116-1117 1124 1126 1128-1129 1131 1134 1136-1137 1142-1143
1146-1147 1149-1150 1156 1161-1164 1167 1170-1173 1177-1181 1190
1192 1197 1200 1204 1208-1209 1214 1217 1219 1222 1230 1232-1233
1235 1241 1245 1247 1254 1257-1258 1260 1262 1271-1273 1283
1286-1289 1299 1306 1314 1320 1330-1332 1334-1335 1342 1345 1349
1365-1367 1370-1372 1374 1381 1394 1407 1419 1428 1436-1437
1440-1441 1443 1446-1449 1454 1459 1461-1462 1468 1470-1471 1475
1477 1479 1482 1491 1497-1498 1504-1505 1507 1513 1522 1524-1526
1528 1531 1534 1536-1537 1548 1550 1553 1555-1559 1562 1567 1578
1590-1591 1597 1599-1601 1612 1614 1616 1619-1620 1622 1624-1626
1628 1631-1632 1634 1636 1639 1644-1645 1648 1651 1653-1656 1658
1660 1662-1663 1667 1669 1671 1675 1678-1681 1683-1686 1689
1691-1692 1703 1709-1711 1717 1724-1726 1729 1734 1737-1738 1740
1743-1744 1749 1753 1759-1761 1770 1777 1786






trachea 


Clontech 


TRC001


9 29-^1 4£ 48 8 


7 104 


107 


110 135 








158 222 262 266 


286 


301 318 331 












352 


372 377 


384 


414 


424 


445 


-446 








454 


472 4 


74 


491 


496 


560 


579 


588 








593 


597 6 


37 


612 


626 


681 


702 


719 








810 


859 866 


878 


894- 


-895 


912 


916 








922 


932 935 


104 


6 1075 1080 


1099- 








1102 


1113 


1208 


1215 


1232 


-12 


33 








1237 


1281 


1312 


1385 


1387 


14 


05 








1414 


1424 


1430 


1437 


1447 


15 


D5 








1569 


1579 


1586 


1600 


1641 


1653 








1667 


1671 


1676- 


1677 


1683 


1691- 








1692 


1711 


1717 


1726 


1772 












17 19 25 41 46 57-58 61 89 104 108 139 152 174 198 200-201 206
263-265 274 290 387 408 420 438 446 448 452 473 491 493 499 503
506 513 519 522 526 530 542-543 560 601 610 632 659 665 720 751
773 780 833 845 857 872 877 912 929 934 937 996 1009 1011 1018
1050 1075 1107 1124 1170 1219 1258 1279 1287 1310 1320 1323
1343-1344 1375 1437 1451-1452 1478 1481 1498 1519 1521 1536
1552 1579 1597 1602 1606 1620 1626-1627 1649 1652 1661 1670
1719 1722-1723
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WATERMAN 
SCORE 


% 

IDENTITY 


1 


Y41736 


Homo 
sapiens 


Human PRO1114 protein
sequence.


1398 


100 


2 


Y66656 


Homo 
sapiens 


Membrane-bound protein
PRO943.


2389 


99 


3 


AF113136


Homo sapiens 


IL-1 receptor-associated- 
kinase-M; IRAK-M 


3043 


100 


4 


AF017806 


Mus musculus 


Zn-15 transcription factor 


6351 


77 


5 


X02761 


Homo sapiens 


fibronectin precursor 


10535 


98 


6 


X02761 


Homo sapiens 


fibronectin precursor 


8990 


89 


9 


X02761 


Homo sapiens 


fibronectin precursor 


12564 


99 


9 


AJ011679 


Homo sapiens 


Rab6 GTPase activating 
protein, GAPCenA 


5251 


99 


10 


W88501 


Homo sapiens 


Human stomach carcinoma clone 
HP10415-encoded protein.


2381 


100 


11 


AF117754 


Homo sapiens 


thyroid hormone receptor- 
associated protein complex 
component TRAP240


11336 


98 


12 


Z97630 


Homo sapiens 


dJ466N1.4 (novel protein 
similar to ANK3 (ankyrin 3, 
node of Ranvier (ankyrin 
G)))


896 


100 


13 


Y58620


Homo sapiens 


Protein regulating gene 
expression PRGE-13. 


1894 


98 


14 


AF213457 


Homo 
sapiens 


triggering receptor expressed 
on myeloid cells 2 


1238 


100 


16 


AF233453 


Homo sapiens 


RACK-like protein PRKCBP1


3124 


99


17 


AF201303 


Homo sapiens 


dhfr oribeta-binding protein 
RIP60 


3130 


98 


18 


AF064205 


Homo sapiens 


dynactin 1 p150 isoform


6377 


100 


19 


U00059 


Saccharomyces cerevisiae


Yhr121wp


174 


26


20 


AB032903


Homo sapiens 


guanosine monophosphate
reductase isolog 


1801 


99 


21 


AB032903 


Homo sapiens 


guanosine monophosphate 
reductase isolog 


1485


99 


22 


AF140507 


Homo sapiens 


Ca2+/calmodulin-dependent
protein kinase kinase beta 


3083


99 


23 


AF140507 


Homo sapiens 


Ca2+/calmodulin-dependent
protein kinase kinase beta 


2300 


99 


24 


AJ289131 


Homo sapiens 


chondroitin 4-O-sulfotransferase


2211 


99 


25 


U33460


Homo 
sapiens 


DNA-directed RNA polymerase 
I, largest subunit 


8777


98 


26 


Y44488 


Homo sapiens 


ACRP30R2 variant protein.


1387


100 


27 


U43701 


Homo sapiens 


ribosomal protein L23a 


791 


100 


28 


U02032 


Homo sapiens 


ribosomal protein L23a 


767 


p f 


29 


Y41324 


Homo sapiens 


Human secreted protein 
encoded by gene 17 clone 
HNFIY77.


1083 


99 


30 


W71749 


Homo sapiens 


Human ubiquitin conjugation 
system protein 2. 


715 


90 


31 


W71749 


Homo sapiens 


Human ubiquitin conjugation 
system protein 2.


631 


82 


32 


AF231917 


Homo sapiens 


long-chain 2-hydroxy acid
oxidase HAOX2 


1811 


100 


33 


Z29481 


Homo sapiens 


3-hydroxyanthranilic acid 
dioxygenase 


1507 


99 


34 


AB001451 


Homo sapiens 


Sck 


2869 


100 


35 


Y00644 


Homo sapiens 


precursor polypeptide (AA -34 
to 287) 


1667 


99 


36 


Y00644 


Homo sapiens 


precursor polypeptide (AA -34 
to 287) 


1104 


98 


37 


Y78795 


Homo sapiens 


Human antizyme-2 (AZ-2) amino
acid sequence. 


3586 


78 


38 


Y78795 


Homo sapiens 


Human antizyme-2 (AZ-2) amino
acid sequence. 


4726 


99 


39 


Y78795 


Homo sapiens 


Human antizyme-2 (AZ-2) amino
acid sequence.


3556 


77 


40 


U93121 


Homo sapiens 


M-phase phosphoprotein-1


3747


100 


41 


Y42750 


Homo sapiens 


Human calcium binding protein 
1 (CaBP-1).


795 


100 


42 


AF282626 


Homo sapiens 


latexin 


1189 


100 


43 


G02150 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 6231. 


384 


94 


44 


U19617 


Mus musculus 


Elf-1


2724 


88 


45 


U19617 


Mus musculus 


Elf-1


2062 


66 


46 


AF100758 


Homo sapiens 


osteoinductive factor OIF 


1538 


100 


47 


Y87591 


Homo sapiens 


Human SPROUTY-1 protein, SEQ
ID NO: 24. 


1737 


99 


49 


X04145 


Homo sapiens 


T3 gamma precursor (aa -22 to 
160) 


942 


99 


51 


X63547 


Homo sapiens 


oncogene 


5845 


99 


52 


M94043 


Rattus 
norvegicus 


rab-related GTP-binding
protein 


1089 


96 


53 


L31783


Mus musculus 


uridine kinase 


$U 


71 


54 


X83973 


Homo sapiens 


transcription factor 


4486 


98 


55 


AF224741 


Homo sapiens 


chloride channel protein 7 


4128 


99 


56


W74805 


Homo sapiens 


Human secreted protein 
encoded by gene 77 clone 
HOEAS24.


1491 


100 


57 


Z50907 


Homo sapiens 


Human TBC-1 cDNA from second 
transcript.


4824 


100 


58 


D79994 


Homo sapiens 


similar to ankyrin of 
Chromatium vinosum.


6089 


99 


59 


D79994 


Homo sapiens 


similar to ankyrin of 
Chromatium vinosum.


4014 


91 


60 


Y59738 


Homo sapiens 


Human normal ovarian tissue 
derived protein 15. 


601 


100 


61 


AB031069 


Homo sapiens 


protein containing CXXC 
domain 1 


1390 


100 


62 


Y66660 


Homo 
sapiens 


Membrane-bound protein
PRO783.


2492 


99 


63 


Y66660 


Homo 
sapiens 


Membrane-bound protein
PRO783.


1709 


99 


64 


S70011 


Rattus sp. 


tricarboxylate carrier 


895 


55 


65 


AF139518 


Rattus 
norvegicus 


A-kinase anchor protein 


178 


24 


66 


W29666 


Homo sapiens 


Homo sapiens DH1308_1 clone 
secreted protein. 


157 


30 


67


AJ245738


Homo sapiens 


claudin-15 


1206 


100 


68 


AF099138 


Rattus 
norvegicus 


GLUT4 vesicle protein 


4183 


87 


69 


AF099138 


Rattus 
norvegicus 


GLUT4 vesicle protein


4906 


86 


70 


Z82059


Caenorhabditis elegans


Similarity to Drosophila ring 
canal protein comes from 
this gene 


1285 


44 


71 


AF224278 


Homo sapiens 


PMEPA1 protein


1282 


100 


72 


AF126426 


Homo sapiens 


neurotrimin 


1809 


100 


73 


Y41652 


Homo 
sapiens 


Human MEK2 protein sequence.


2065 


99 


74 


Y41652 


Homo 
sapiens 


Human MEK2 protein sequence.


1207 


100 


75 


AF188622 


Mus musculus 


selectively expressed in 
embryonic epithelia protein-1


1485 


74 


76


AE000406 


Escherichia 
coli 


putative DNA topoisomerase


950 


100 


77 


X99302 


Homo sapiens 


Pop1


655 


100 


78 


AL136538 


Schizosaccharomyces pombe


similarity to S. cerevisiae 
kti12 protein


210 


31 


79


AFll975<; 


Homo sapiens 


G4 


1554 


99 
80


AL096768 


Homo sapiens 


dJ858B16.2 
(phosphatidylserine 
decarboxylase (PSSC, EC 
4.1.1.65)) 


2033 


100 


81 


AL096768 


Homo sapiens 


dJ858B16.2
(phosphatidylserine 
decarboxylase (PSSC, EC 
4.1.1.65))


1220 


96 


82 


X57351 


Homo sapiens 


1-8D 


677 


98 


83 


AC005594 


Homo sapiens 


R26984_1


2700 


98 


84 


X73113 


Homo sapiens 


fast MyBP-C 


5959 


99 


85 


AF097330


Homo sapiens 


H1 chloride channel; p64H1;
CLIC4 


1305 


99 


86 


AB018423


Mus musculus 


SH2 domain-containing protein


1360 


78 


87 


AF272151 


Homo sapiens 


adaptor protein CIKS 


3084 


99 


88 


AF196329 


Homo 
sapiens 


triggering receptor expressed 
on monocytes 1 


1214 


100 


89 


AB016879 


Arabidopsis 
thaliana 


contains similarity to pre-mRNA
splicing factor~gene_id:MRB17.2


634 


36 


90 


AJ133721 


Mus musculus 


homeodomain protein 


654 


57 


91 


AJ242864 


Mus musculus 


phtf protein 


619 


61 


92 


A61971 


unidentified 


MCSP 


11676 


99 


93 


Y99365 


Homo sapiens 


Human PRO1250 (UNQ633) amino 
acid sequence SEQ ID NO: 86. 


3890 


100 


94 


Y87231 


Homo sapiens 


Human signal peptide 
containing protein HSPP-8 
SEQ ID NO: 8. 


1031 


100 


95 


AF227741 


Rattus 
norvegicus 


protein kinase WNK1 


2428 


95 


96 


AF227741 


Rattus 
norvegicus 


protein kinase WNK1 


1961 


94 


97 


Y92513 


Homo sapiens 


Human OXRE-10. 


1626 


100 


98 


AL021366 


Homo sapiens 


CICK0721Q.3 (Kinesin related 
protein) 


3423 


100 


99 


AC005783 


Homo sapiens 


R33083_1


1974 


99 


100 


Y95293 


Homo sapiens 


Human GEF containing NEK-like 
kinase substrate sGNK. 


4092 


99 


101 


AL118501 


Homo sapiens 


dJH91N16.1 (A novel protein 
(translation of the cDNA 
DKFZp566A0946, Em:AL050069))


1509 


100 


102 


AJ006267 


Homo sapiens 


ClpX-like protein 


3233 


100 


103 


AF100753 


Homo sapiens 


ancient ubiquitous 46 kDa 
protein AUP1 


2042


96 


104 


AB015982 


Homo sapiens 


serine/threonine kinase 


4718 


100 


105 


AF151074 


Homo sapiens 


HSPC240 


831 


6?4 


106 


M35522 


Canis familiaris


GTP-binding protein (rab7) 


354 


50 


107


R99800 


Homo sapiens 


NTII-1 nerve protein, 
facilitates regeneration of 
nerve cells. 


2337 


93 


108 


AF125533 


Homo sapiens 


NADH-cytochrome b5 reductase 
isoform 


1290 


93 


109 


AC005614


Homo sapiens 


F23269_2


3369 


99 


110 


AF064729 


Homo sapiens 


RAN binding protein 16 


3285 


100 


111 


X52425 


Homo sapiens 


interleukin 4 receptor 


4496 


100 


112


Y41686 


Homo 
sapiens 


Human PRO274 protein
sequence.


2285 


100 


113 


W15506 


Homo sapiens 


Mitogen activating protein 
kinase ERK1. 


1991 


100 


114 


Y71071 


Homo sapiens 


Human membrane transport
protein, MTRP-16. 


1190 


99 


115 


AL049548 


Homo sapiens 


dJ398G3.1 (ortholog of rat 
CPG2)


3497 


99 


116 


AF189817 


Mus musculus 


evectin-2 


1124 


90 


117


W30891 


Homo sapiens


Human cytostatin III protein.


715


99








118 


AF116618 


Homo sapiens 


PRO1038 


1469 


100 


119 


Y08915 


Homo sapiens 


alpha 4 protein 


1748 


100 


120 


AF098070


Drosophila 
melanogaster 


Lis1 homolog


192 


39 


121 


AF052432 


Homo sapiens 


katanin p80 subunit


181 


37 


122 


Y70743


Homo sapiens 


PSEQ-1 protein encoded by 
NSEQ gene associated with 
matrix remodelling. 


2637 


98 


123 


AF083246 


Homo sapiens 


HSPC028 


2132 


100 


124 


Y27096 


Homo sapiens 


Human viral receptor protein 
(ACVRP).


833 


99 


125 


M63109 


Leishmania 
major 


glycoprotein 96-92 


172 


27 


126 


U75467 


Drosophila 
melanogaster 


Atu 


935 


36 


127 


Z68220 


Caenorhabditis elegans


Similarity to Human ADP/ATP 
carrier protein 


438 


43 


128


AF095927 


Rattus 
norvegicus 


protein phosphatase 2C 


1927 


94 


129 


W92958 


Homo sapiens 


Human zsig44 protein.


463 


100 


130 


AF115391 


Lactobacillus sakei


ribokinase RbsK 


508 


37 


131 


X93498 


Homo sapiens 


21-Glutamic Acid-Rich Protein 


1250 


100 


132 


X93498 


Homo sapiens 


21-Glutamic Acid-Rich Protein 


916 


87 


133 


W52811 


Homo sapiens 


Human DBI/ACBP-like protein
(DBIH).


705 


97 


134 


Y84444 


Homo sapiens 


Amino acid sequence of a 
human RNA-associated
protein. 


3230 


100 


135


M69181


Homo sapiens 


non-muscle myosin B 


189 


20 


136


W74882 


Homo sapiens 


Human secreted protein 
encoded by gene 154 clone 
HE6FL83. 


480 


100 


137 


W78200 


Homo sapiens 


Human secreted protein 
encoded by gene 75 clone 
HHGACJ8 1 . 


855 


99 


138 


AL033520 


Homo sapiens 


dJ349A12.1 (similar to 
KIAA0701 protein) 


424 


39 


139 


AF020261 


Santalum 
album 


proline rich protein 


119 


30 


140 


X70394 


Homo sapiens 


zinc finger protein 


1634 


100 


141 


Y06439 


Homo sapiens 


Human protease HUPM-8. 


936 


100 


142


Z68493 


Caenorhabditis elegans


predicted using Genefinder 


365 


42 


143 


AB018107 


Arabidopsis 
thaliana


ADP-ribosylation factor-like 
protein 


596 


65 


144 


AF161483 


Homo sapiens 


HSPC134 


580 


51 


145


Y84902 


Homo sapiens 


A human proliferation and
apoptosis related protein. 


480 


100 


146 


AB004906 


Ipomoea 
purpurea 


transposase 


146 


20 


147 


AC007357 


Arabidopsis 
thaliana 


F3F19.18 


647 


31 


148 


W75155 


Homo sapiens 


Human secreted protein 
encoded by gene 41 clone 
HNTME13.


1494 


98 


149 


AF056490 


Homo sapiens 


cAMP-specific
phosphodiesterase 8A 


3710 


99 


150


Y58171 


Homo 
sapiens 


Human hydrolase homologue 
HHH-7.


785 


99 


151


U10397 


Saccharomyces cerevisiae


Yhr148wp


515 


53 


152 


X73478


Homo sapiens 


phosphotyrosyl phosphatase 
activator 


1719 


99 


153 


AL049697 


Homo sapiens 


dJ382H0.5.1 (novel protein
similar to arginyl-tRNA)


2034


99






154


AF169802 


Homo sapiens 


cytochrome b5 reductase b5R.2 


1455 


99 


155 


X94703 


Homo sapiens 


rab28 


1126 


99 


156 


Y25716 


Homo sapiens 


Human secreted protein 
encoded from gene 6.


1471 


100 


158 


W77404 


Homo sapiens 


Secreted salivary polypeptide 
zsig32.


937


100 


159 


Y17248 


Homo sapiens 


Human protein kinase 
inhibitor-2 (PKI-2).


383 


100 


160 


J04970 


Homo sapiens 


carboxypeptidase M precursor 


2395 


100 


161 


W54040 


Homo sapiens 


Human interferon-inducible 
protein, HIFI. 


484 


98 


162 


AL022724 


Homo sapiens 


dJ413H6.1.1 (hamster
Androgen-dependent Expressed
Protein LIKE PUTATIVE 
protein) (isoform 1) 


1357 


100 


163 


AF125535 


Homo sapiens 


pp21 homolog 


193 


45 


164 


G03632 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7713.


463 


97 


165 


AJ250839 


Homo sapiens 


serine/threonine protein 
kinase 


1442 


71 


166 


L09649 


Zymomonas 
mobilis 


zm2 


173 


37 


167 


Y73337 


Homo sapiens 


HTRM clone 1944530 protein 
sequence. 


1204 


100 


168 


W88645 


Homo sapiens 


Secreted protein encoded by 
gene 112 clone HUKFC71.


1084 


100 


169 


AF214731 


Homo sapiens 


ATP-dependent RNA helicase


4402 


100 


170 


AE000871 


Methanobacterium thermoautotrophicum


conserved protein 


166 


27 


171 


Y27^84 


Homo sapiens 


Human secreted protein 
encoded by gene No. 118. 


821 


100 


172 


AF226044 


Homo sapiens 


HSNFRK 


2904 


100


173 


AJ245946 


Homo sapiens 


neuroglobin 


779 


100


174 


D43949 


Homo sapiens 


This gene is novel.


3202 


100


175 


Y07923 


Homo sapiens 


GTP-binding protein H


1205 


100 


176 


W90338 



Homo 
sapiens 


Human DPI homologue protein. 


956 


100 


177


Y41675 


Homo sapiens 


Human channel-related
molecule HCRM-3.


1122 


100 


178 


Y41674 


Homo sapiens 


Human channel-related
molecule HCRM-2.


936 


99 


179




Homo sapiens 


krueppel-like zinc finger 
protein HZF2 


4100 


93 


180 


X03084 


Homo sapiens 


C1q B-chain precursor


1240 


100 


1 oi 
-IwJ, 


U57344


Mus musculus 


Meis3 


1813 


89 


183


U57344


Mus musculus 


Meis3 


1743 


86 


184


U57344 


Mus musculus 


Meis3 


1070 


86 


185 


AF033120 


Homo sapiens 


p53 regulated PA26-T2 nuclear
protein 


1389 


58 


186


AF200357 


Mus musculus 


pantothenate kinase 1 beta 


1605 


82 


187 


W75058 


Homo sapiens 


Human secreted protein 
encoded by gene 2 clone 
HLDBG33. 


1188 


99 


188 


AJ292529 


Homo sapiens 


suppressor of sterile four 1 


2424 


100 


190 


ye An -a a 


Homo sapiens 


protein-tyrosine phosphatase


3705 


100 


191 


Y22203 


Homo sapiens 


Human calcium-binding
phosphoprotein, CBPP-1,
protein sequence. 


1083 


99 


192 


W53692 


Homo 
sapiens 


Human secreted protein 12. 


1975 


100 


193


W87772 


Homo sapiens 


Human serum glucocorticoid- 
regulated kinase (H-SGK2) 
polypeptide.


2605 


99 
194


AF084259 


Mus musculus


bromodomain-containing
protein BP75 


693 


54 


195 


Y00752 


Rattus 
norvegicus 


serine dehydratase (AA 1-327)


994 


61 


196 


W95349 


Homo sapiens 


Human foetal brain secreted 
protein fh170_7.


2*96 


100 


197 


AB028859 


Homo sapiens 


hDj9 


1890 


100 


198 


W95633 


Homo sapiens 


Homo sapiens secreted protein 
gene clone hm236_1.


1614 


100 


199 


Y44277 


Homo 
sapiens 


Human nucleic acid methylase-2.


2096 


99 


200 


AB030039 


Homo sapiens 


hPACPL1


2258 


100


201 


X54162 


Homo sapiens 


64 Kd autoantigen 


2918 


99 


202 


G02061 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6142. 


558 


99 


203 


X13885 


Nicotiana 
tabacum 


extensin (AA 1-620) 


185 


33 


204 


J04204 


Bos taurus 


32 kd accessory protein 


1837 


100 


205 


J04204 


Bos taurus 


32 kd accessory protein 


1101 


100 


207 


Y87283 


Homo sapiens 


Human signal peptide 
containing protein HSPP-60 
SEQ ID NO: 60.


1318 


100 


208 


Y02860 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 65. 


936 


96 


209 


AKL21889 


Homo sapiens 


dJ1076E17.1 (KIAA0823 protein 
(continues in AL023803)) 


694 


54 


210 


AF226732 


Homo sapiens 


NPD007 


1345 


76 


211 


X66295 


Mus musculus 


Clq C chain 


970 


73 


212 


Z29328 


Homo sapiens 


Ubiquitin-conjugating enzyme
UbcH2 


966 


100 


213 


Z29328 


Homo sapiens 


Ubiquitin-conjugating enzyme
UbcH2 


542 


98 


214 


AJ002030 


Homo sapiens 


progesterone binding protein


1163 


100 


215 


X70649 


Homo sapiens 


member of DEAD box protein 
family 


3933 


100 


216 


AF250558 


Homo sapiens 


claudin-2 


1169 


99 


217 


AL021453 


Homo sapiens 


dJ821D11.1 (PUTATIVE protein)


259 


100 


218 


Y08565 


Homo sapiens 


UDP-GalNAc:polypeptide
N-acetylgalactosaminyltransferase


3331 


99 


219 


Y94452 


Homo sapiens 


Human inflammation associated 
protein 


2067 


100 


220 


AL035521 


Arabidopsis 
thaliana 


putative protein 


315 


42 


221 


AL031786 


Schizosaccharomyces pombe


putative proline-trna 
synthetase 


811 


41 


222 


AL109736 


Schizosaccharomyces pombe


WD repeat protein 


626 


40 


223 


X52493 


Glycine max 


DNA-directed RNA polymerase 


136 


23 


224 


AL035659 


Homo sapiens 


dJ979N1.1 <dJ979N1.1>


£*1$9 


98 


225


AB032401 


Mus musculus 


mmDj4 


1761 


92 


226 


AB032401 


Mus musculus


mmDj4


1988 


92 


227 


X83502 


Saccharomyces cerevisiae


J1007 


112 


26 


228 


X83502 


Saccharomyces cerevisiae


J1007 


79 


"2Ei 


229 


AF143723 


Homo sapiens 


heat shock protein HSP60 


2557 


99 


230 


Y66677 


Homo 
sapiens 


Membrane-bound protein
PRO828.


982


100 


231 


AB027466 


Homo sapiens 


spondin 2 


1756 


99 


232 


W95634 


Homo 
sapiens 


Homo sapiens secreted 
protein. 


1391 


100


233 


W00365 


Homo sapiens 


Human cyclin B1.


2218


99 


234 


Y53762 


Homo sapiens 


A GTP-binding polypeptide
designated RAQ.


1017


100



WO 01/53312

PCT/US00/34263

TABLE 2

SEQ ID NO:

ACCESSION NUMBER

SPECIES

DESCRIPTION

SMITH-WATERMAN SCORE

% IDENTITY






235 


Z50749 


Homo sapiens 


yeast sds22 homolog


1800 


100 


236 


Z50749


Homo sapiens 


yeast sds22 homolog 


1754 


98 


237 


AB026491 


Homo sapiens 


PICK1 


2137 


100 


238 


AJ270205 


Entodinium 
caudatum 


putative phosphatidylinositol-4-phosphate
5-kinase


114 


37 


239 


AB030189 


Mus musculus 


contains transmembrane (TM) 
region and ATP binding region 


710 


93 


240 


W56538 


Homo sapiens 


Human hedgehog interacting 
protein (HIP).


3785 


99 


241 


W56538 


Homo sapiens 


Human hedgehog interacting 
protein (HIP).


3436 


99 


242 


AF155107 


Homo sapiens 


NY-REN-37 antigen 


996 


99 


243 


AF155107 


Homo sapiens 


NY-REN-37 antigen


1005 


100 


244 


AL031320 


Homo sapiens 


dJ20N2.1 (novel protein 
similar to yeast and 
bacterial cytosine 
deaminase) 


763 


99 


245 


U37026 


Rattus 
norvegicus 


sodium channel beta 2 subunit 


162 


30 


246 


AL078599 


Homo sapiens 


dJ991C6.1 (novel protein 
similar to C. elegans 
F55A12.9 (Tr:P91086))


2391 


98 


247 


U32274 


Saccharomyces cerevisiae


Ydr386wp; CAI: 0.12 


191 


37 


248


Y41719 


Homo 
sapiens 


Human PROS 64 protein 
sequence.


1879 


100 


249 


AB029434 


Homo sapiens 


ghrelin precursor 


611 


100 


250 


X97831 


Rattus 
norvegicus 


carnitine/acylcarnitine 
carrier protein 


246 


38


251 


W80993


Homo 
sapiens 


Human RIP-interacting factor
RIF. 


1724 


100 


252 


Y94873 


Homo 
sapiens 


Human protein clone HP02632. 


1876 


100 


253 


W59878


Homo sapiens 


Amino acid sequence of the 
cDNA clone AIF-2 (HEBGM49).


765 


100 


254 


AL354533 


Leishmania
major 


possible adenylate kinase 


265 


34 


255 


AF233322 


Mus musculus 


zinc transporter like 2 


1916 


95 


256 


Y78113 


Homo sapiens 


Human cytokine signal 
regulator CKSR-1 SEQ ID 
NO:1.


2247 


99 


257 


AL035539 


Arabidopsis 
thaliana


putative amino acid transport 
protein 


390 


27 


258 


W74787 


Homo sapiens 


Human secreted protein 
encoded by gene 58 clone 
HHFHN61.


1171 


100


259 


AL635^89 


Homo sapiens 


dJ187J11.1 (novel protein
similar to protein kinase C 
inhibitors) 


974 


100 


260 


AE000909 


Methanobacterium thermoautotrophicum


serine/threonine protein
kinase related protein 


363 


30 


261 


AL050131 


Homo sapiens 


hypothetical protein 


626 


100 


262 


AF019661


Mus musculus 


zeta proteasome chain; PSMA5 


1214 


100 


263 


AL035593 


Homo sapiens 


dJ310J6.1 (novel protein) 


821 


100 


264


AL022318 


Homo sapiens 


bK150C2.3 (PUTATIVE novel 
protein similar to APOBEC1) 


1072 


100 


265 


AF205940 


Homo sapiens 


endomucin


1289 


100 


266


AL023583 


Homo sapiens 


dJ500L14.1 (novel protein)


789 


100 


267 


AL034548


Homo sapiens 


dJ1103G7.3 (novel protein 
kinase domains containing 
protein similar to 
phosphoprotein C8FW) 


1888 


99 






WO 01/53312 



PCT/US00/34263 



TABLE 2 



SEQ
ID
NO:


ACCESSION
NUMBER


SPECIES


DESCRIPTION


SMITH-
WATERMAN
SCORE


%
IDENTITY


268 


AF161470 


Homo sapiens 


HSPC121 


1884 


98 


269 


AF161470 


Homo sapiens 


HSPC121 


1232 


96 


270 


X90763 


Homo 
sapiens 


HHa5 hair keratin type I 
intermediate filament 


2190 


99 


271 


AF207600 


Homo sapiens 


ethanolamine kinase 


1952 


100 


272 


M32334 


Homo sapiens 


intercellular adhesion 
molecule 2 


1436 


100 


273 


AF161483 


Homo sapiens 


HSPC134 


663 


61 


274 


Y53052 


Homo sapiens 


Human secreted protein clone 
df202_3 protein sequence SEQ 
ID NO:110.


587 


100 


276 


Y77576 


Homo sapiens 


Human cytoskeletal protein 
(HCYT) (clone 2195418) . 


762 


100 


277 


AF077042 


Homo sapiens 


30S ribosomal protein S7
homolog 


1269 


100 


278 


Y94907 


Homo sapiens 


Human secreted protein clone 
ca106_19x protein sequence
SEQ ID NO: 20. 


1619 


98 


279 


Y68788 


Homo sapiens 


Amino acid sequence of a 
human phosphorylation 
effector PHSP-20. 


2801 


99 


280 


Z75134 


Canis
familiaris


rod transducin 


1816 


100 


281 


Z75134 


Canis
familiaris


rod transducin 


1718 


96 


282 


AF249873 


Homo sapiens 


muscle-specific protein 


1395


100 


283 


AL050007 


Homo sapiens 


hypothetical protein 


405 


98 


284 


AF201931 


Homo sapiens 


DC1 


1859 


99 


285 


AF156102 


Homo sapiens 


ELL complex EAP30 subunit 


1318 


99 


286 


Y35897 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
146. 


1250 


99 


287 


U88964 


Homo sapiens 


HEM45 


923 


100 


288 


AL050143 


Homo sapiens 


hypothetical protein 


598 


100 


289 


AJ011098 


Homo sapiens 


telethonin 


574 


100 


290 


Y66724 


Homo 
sapiens 


Membrane-bound protein 
PRO836.


2321 


100 


291 


AF034801 


Homo sapiens 


liprin-alpha4


2565 


98 


292 


AF034801 


Homo sapiens 


liprin-alpha4 


2590 


100 


293 


AL049851 


Homo sapiens 


dJ889J22B.1 (novel protein
(isoform 1))


1738 


100 


294 


Y73348 


Homo sapiens 


HTRM clone 8396S1 protein 
sequence . 


1245 


99 


295 


L11672 


Homo sapiens 


zinc finger protein 


1694 


44 


296 


AL035423 


Homo sapiens 


dJ20I3.1 (brain mitochondrial 
carrier protein-1 (BMCP1))


1024 


79 


297 


AF198532 


Homo sapiens 


lymphoid enhancer binding 
factor-1 


2173 


100 


298 


AF161417 


Homo sapiens 


HSPC299


1147 


85 


299 


AF159141 


Homo sapiens 


breast cancer metastasis- 
suppressor 1 


1236 


99 


300 


U26397 


Rattus 
norvegicus 


inositol polyphosphate 4- 
phosphatase 


160 


30 


301 


AF036145 


Homo sapiens 


meningioma-expressed antigen
5 


3458 


100 


302 


Z82022 


Homo sapiens 


GlcNAc-1-P transferase


2067 


99 


303 


AF269232 


Mus musculus 


butyrophilin-like protein 
BUTR-1 


271 


50 


304 


AJ222644 


Arabidopsis 
thaliana 


asparaginyl-tRNA synthetase 


659 


50 


305 


AF054180


Homo 
sapiens 


hematopoietic cell derived 
zinc finger protein 


351 


79 


306


AJ272079 


Homo sapiens 


APOBEC-1 stimulating protein 


3056 


100 


308 


Y44486 


Homo
sapiens


Human GPRW receptor
polypeptide.


1721 


100 


309 


AJ131891 


Homo sapiens 


DNA polymerase mu 


2598 


100 





310 


AF293335 


Homo sapiens 


p30 DBC


1248 


92 


311 


AF176525 


Mus musculus 


F-box protein FBL12 


1501 


93 


312 


X57802 


Homo sapiens 


immunoglobulin lambda light 
chain 


959


81 


313 


Z36715 


Homo sapiens 


Net 


2048 


98 


314 


AF161532 


Homo sapiens 


HSPC047


727 


100 


315 


AF208068


Homo sapiens 


kelch-like protein KLHL3a 


3046 


100 


316 


Y66666 


Homo 
sapiens 


Membrane-bound protein
PRO1013.


1166 


100 


317 


Y29666 


Homo sapiens 


Human Ras protein RAPR-1. 


1253 


98 


318 


AJ387747 


Homo sapiens 


sialin 


2614 


99 


319 


AF161362 


Homo sapiens 


HSPC099 


224 


40 


320 


Y68773 


Homo sapiens 


Amino acid sequence of a
human phosphorylation 
effector PHSP-5. 


2243 


99 


321 


AJ238379 


Homo sapiens 


putative TH1 protein 


3013 


100 


322 


AB040812 


Homo sapiens 


protein kinase PAK5 


3792 




323 


Y95013 


Homo sapiens 


Human secreted protein 
vc48_1, SEQ ID NO: 66.


913 


100 


324 


Y13381 


Homo sapiens 


Amino acid sequence of 
protein PRO271.


1976 


100 


325 


Y94944 


Homo sapiens 


Human secreted protein clone 
bf157_16 protein sequence
SEQ ID NO:94.


2305 


QO j 


326 


Y76884


Homo sapiens 


Retinoblastoma binding 
protein-7 sequence.


6728 


99 


327 


AF198532 


Homo sapiens 


lymphoid enhancer binding 
factor-1


2173 


100 


328 


Z78013 


Caenorhabditis
elegans


Similarity to Drosophila 
Cadherin-related tumor 
suppressor 


569 


33 


329 


AF212921 


Mus musculus 


MMTV receptor variant 1 


484 


94 


330 


Z75330 


Homo
sapiens]
>R65207
R65207 02-MAR-1995
27-AUG-1993
Human
stromalin-1.
[Homo
sapiens


nuclear protein SA-1


6492 


99 


331 


AL008583 


Homo sapiens 


dJ327J16.3 (supported by
GENSCAN, FGENES and GENEWISE)


2133 


99 


332 


Y36104 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
489. 


310 


41 


333 


AJ271669 


Homo sapiens 


putative sialoglycoprotease 


1747 


100 


334 


AF156598 


Mus musculus 


p53-regulated DDA3


997 


64 


335 


M99058 


Eimeria 
maxima 


em100 gene is homologous the
Eimeria tenella gene et100


154 


26 


336 


Y85564 


Homo sapiens 


Human homologue of UNC-53
(Hs-UNC-53/1) sequence.


3386


97


337 


Y85564


Homo sapiens 


Human homologue of UNC-53 
(Hs-UNC-53/1) sequence.


2602 


94 


338 


Y85564 


Homo sapiens 


Human homologue of UNC-53 
(Hs-UNC-53/1) sequence. 


3447 




339 


Z66561 


Caenorhabditis
elegans


Similarity to Human rab13
protein (PIR Acc. No.
A49647).


716 


34 


340 


AB021643 


Homo 
sapiens 


gonadotropin inducible 
transcription repressor-3 


2761 


99 


341 


G01946 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6027. 


465 


98 


342 


AF020591 


Homo sapiens 


zinc finger protein 


1091 


48 


343 


L29154 


Homo sapiens 


immunoglobulin heavy chain
VDJ region


439


84






344 


U10281 


Sus scrofa 


gastric mucin 


279 


24 


345 


AK000404 


Homo sapiens 


unnamed protein product 


1177 


99 


346 


L22557 


Rattus 
norvegicus 


calmodulin-binding protein 


1949 


84 


347 


L22557 


Rattus 
norvegicus 


calmodulin-binding protein 


23*3 


91 


348 


AL049481 


Arabidopsis 
thaliana


AIG1-like protein


316 


30


350


AJ251516 


Mus musculus 


cysteine and histidine-rich
protein 


1460 


99 


351 


AK024477 


Homo sapiens 


FLJ00070 protein 


1773 


100 


352 


U50133 


Homo sapiens 


ankyrin 


502 


33 


353 


AK000625 


Homo sapiens 


unnamed protein product 


721 


100 


354 


AF161420 


Homo sapiens 


HSPC302


2623 


97 


355 


AJ010014 


Homo sapiens 


M96A protein 


1269 


47 


356 


AF151029 


Homo sapiens 


HSPC195 


941 


91 


357 


AL022327 


Homo sapiens 


dJ355C18.1 (KIAA0027) 


1911 


100 


358 


W78128 


Homo sapiens 


Human secreted protein 
encoded by gene 3 clone 
HOSBI96. 


1117 


100 


359 


X03414 


Drosophila 
melanogaster 


Kr polypeptide 


316 


45 


360 


AF151079 


Homo sapiens 


HSPC245


643 


100 


361 


Y53886


Homo sapiens 


A suppressor of cytokine 
signalling protein 
designated HSCOP-6. 


530 


41 


362


AF254741 


Drosophila 
melanogaster 


Centaurin Gamma 1A 


681 


46 


363 


AF213465 


Homo sapiens 


dual oxidase 


2016 


100 


364 


AF181562 


Homo sapiens


proSAAS 


1319 


100 


365 


AF181562 


Homo sapiens 


proSAAS 


1024 


99 


366 


U73200 


Mus musculus 


p116Rip


884 


82 


367


AF263744 


Homo sapiens 


erbb2-interacting protein
ERBIN 


4973


99 


368 


U37501 


Mus musculus 


laminin alpha 5 chain 


5867 


72 


369 


AF043695 


Caenorhabditis
elegans


similar to the protein 
phosphatase 2c family


549 


J o 


370 


Y73440


Homo sapiens 


Human secreted protein clone 
yj23_l protein sequence SEQ 
ID NO: 102. 


1484 


99 


371 


AF272833 


Homo sapiens 


misato 


2869 


97 


372 


AF198454 


Homo sapiens 


epithelial protein lost in 
neoplasm beta 


3927 


100 


373 


Y73345 


Homo sapiens 


HTRM clone 438283 protein
sequence . 


273 


80 


374 


AF169017 


Homo sapiens 


formiminotransferase
cyclodeaminase


2717 


98 


375 


A9516* 


unidentified 


RED ALPHA 


1202 


99 


376


W74828 


Homo sapiens 


Human secreted protein 
encoded by gene 100 clone 
HLQAB52 . 


1012 


99 


377 


Y32131 


Homo sapiens 


Human LYST-2 protein. 


3556 


99 


378 


M14912 


Homo sapiens 


pol 


132 


86 


379 


AF090934 


Homo sapiens 


PRO0518 


382 




380 


X66363 


Homo sapiens 


serine/threonine protein 
kinase 


2499 


100 


381 


Y41699 


Homo 
sapiens 


Human PRO703 protein 
sequence . 


2362 


100 


382 


AF174498 


Homo sapiens 


GR AF-1 specific protein 
phosphatase 


7008 


98 


383 


U64608 


Caenorhabditis
elegans


coded for by C. elegans cDNA
yk173c12.5


24* 


36 


384 


U50133 


Homo sapiens 


ankyrin 


502 


33 


385


AJ238520 


Homo sapiens 


putative transcription 
factor-like nuclear regulator 


4123 


97


387 


/ir Z U O 0 fi D 


Homo sapiens


BM - U 0 3 


1375 


99 


389


X57821


Homo sapiens 


immunoglobulin lambda light 
chain 


797 


76 


390 


AF182404 


Homo sapiens 


mitochondrial uncoupling 
protein 1 


1670 


99 


~~m — 




Homo sapiens 


Human homologue of UNC-53
(Hs-UNC-53/1) sequence.


3386 


97 


393 


AF178432 


Homo sapiens 


SH3 protein 


3700 


100 


394 




Drosophila
melanogaster


cytoplasmic protein 89BC 


1616 


62 


395 


AF181721 


Homo sapiens 


RU2S 


2254 


100 


396 




Homo sapiens 


Amino acid sequence of a 
human betaIV-spectrin
protein. 


1626 


98 


397


VIA U T 1 U 


Mus musculus 


zinc finger protein neuro-d4 


749 


60 


398


>Uj J y t/x J / 


Homo sapiens 


hypothetical protein 


263 


51 


399 


AF217525 


Homo sapiens 


Down syndrome cell adhesion 
molecule 


5337 


60 


400


AL022599


Schizosaccharomyces
pombe


WD repeat protein 


447 


27 


401 


AC004859


Homo sapiens 


similar to 2-oxoglutarate 
dehydrogenase similar to 
Q02218 (PID:g1352618)


4176 


78 


402 


AB010266 


Mus musculus 


tenascin-X 


10246


62 


403 


AL133288 


Homo sapiens 


dJ671D7.1 (similar to
D. melanogaster CG5986 
protein) 


761 


100 


404 


Z68753 


Caenorhabditis
elegans


ZC518.3b


888 


48 


405 


Z78013


Caenorhabditis
elegans


Similarity to Drosophila 
Cadherin-related tumor
suppressor 


569 


33 


406 


AB031230 


Homo sapiens 


protein containing CXXC 
domain 2 


1196 


97 


407




Homo sapiens 


NY-REN-36 antigen 


1168 


100 


408 


Y57945 


Homo sapiens


Human transmembrane protein 
HTMPN-69. 


1538 


99 


409 


Z18361 


Ovis aries 


trichohyalin 


184 


30 


410 




Homo sapiens 


RhoGEF


2733 


100 


411 


AF176529 


Mus musculus 


F-box protein FBX13 


2072 


94 


412


Hi? ziuo^z 


Homo sapiens 


HARP 


4880 


100 


413 


AL031658 


Homo sapiens 


dJ310O13.7 (novel protein
similar to H. roretzi HRPET-3)


776 


98 


414 


yc 7 -a op 
AO / J JO 


Homo sapiens 


pm5 protein 


6131 


99 


415 


AB029826 


Homo sapiens 


3-methylcrotonyl-CoA
carboxylase biotin-containing 
subunit 


2961 


99 


416




Saccharomyces
cerevisiae


Lphlp 


115 


42 


417




Leishmania
major


possible t26fl7.21 


239 


35 


418


vnpi nn 


Homo sapiens 


Human PRO331 protein.


330 


29 


419


Til c*i "3 n 

UlJlJl 


Homo sapiens 


p126


2228 


54 


420 


AF117946 


Homo sapiens 


Link guanine nucleotide
exchange factor II


2363 


100 


421 


AF190635 


Drosophila 
melanogaster 


ankyrin 2


755 


30 


422 


AF302150 


Homo 
sapiens 


phosphoinositol 3-phosphate-
binding protein-2


1962 


100 


423 


AL137530


Homo sapiens 


hypothetical protein 


433 


94 


424 


X63753 


Homo sapiens 


son-a 


7269 


100 


425 


AB027249 


Homo sapiens 


MAPKK like protein kinase 


1693 


100 


426 


AF279144 


Homo sapiens 


tumor endothelial marker 7
precursor 


1084 


55


427 


AF279144 


Homo sapiens 


tumor endothelial marker 7 
precursor 


1259 


56 


428 


AE003683


Drosophila 
melanogaster 


CG8312 gene product 


149 


29 


429 


Y07829


Homo sapiens 


RING finger protein 


2201 


Q Q 


430 


AF096897 


Drosophila 
melanogaster 


pushover 


4442 


47 


431 


U41387 


Homo sapiens 


Gu protein 


4021 


99 


432 


AF023674 


Homo sapiens 


nephrocystin 


3783 


100 


433 


AF146760 


Homo 
sapiens 


septin 2-like cell division
control protein 


2284 


inn 


434 


AB006697 


Arabidopsis 
thaliana 


cleft lip and palate
associated transmembrane
protein-like


886 


42 


437 


Y94247 


Homo sapiens 


Human calcium binding protein 
hCBP . 


1704 


100 


438 


AB040672 


Homo sapiens 


UDP-GalNAc:polypeptide
N-acetylgalactosaminyltransferase


1075 


63 


439 


AF105228 


Bos taurus 


tuftelin


cos 


33 


440 


R06463 


Homo sapiens 


Derived protein of clone
ICA13 (ATCC 40553).


i nil 


99 


441 


X14971 


Mus musculus 


alpha-adaptin (A) (aa 1-977)


*±t33 / 


98 


442 


X53773 


Rattus 
norvegicus 


alpha-c large chain (aa 1-938)


J J i5 


81 


443 


Y66639 


Homo 
sapiens 


Membrane-bound protein
PROH36 . 




99 


444 


AC067754 


Arabidopsis 
thaliana 


unknown protein; 20348-23707 


114 


33 


445 


AF229032 


Mus musculus 


piL 


2077 


93 


446 


AF056035 


Rattus 
norvegicus 




2662 


85 


447 


AF132484 


Mus musculus 




478


51 


448 


W89024 


Homo sapiens 


Polypeptide fragment encoded 

Jr yciic j. -J a • 


528 


45 


449 


AF161445 


Homo sapiens 


HSPC327


1606 


100 


450 


Z68753 


Caenorhabditis
elegans




951 


49 


451 


W39160 


Homo sapiens 


Human partial complement
factor H protein fragment 3.




32 


452 


W85727 


Homo 
sapiens 


Novel protein (clone
BM46JL0).


*5 *7 QQ 


99


453




Homo sapiens 


A bone marrow secreted 
protein designated RMS 1 1 R


2810 


100 


454 


D87438 


Homo 
sapiens 


Similar to a C. elegans 
protein in cosmid C14H10 


4069 


100 


455 


AF240468


Homo sapiens


nicastrin


3687


100 


456 


Z15005


Homo sapiens 


CENP-E 


13305 


99 


457 


M59216 


Homo 
sapiens 


gamma-aminobutyric acid
receptor beta-1 subunit


2477 


100 


458 


Y73467 


Homo sapiens 


Human secreted protein clone 
yd61_1 protein sequence SEQ
ID NO: 156. 


966 


100


459


W67824 


Homo sapiens 


Human secreted protein 
encoded by gene IB clone- 


535 


100 


460 


AF163151 


Homo sapiens 


dentin sialophosphoprotein 
precursor 


279 


19 


461 


D87446 


Homo sapiens 


Similar to a C. elegans 
protein encoded in cosmid 
C27F2 (U40419) 


9196 


99 


462 


G04044 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8125. 


486 


93 


463 


AC002398 


Homo sapiens 


F25965_l 


1018 


100 


464 


AF064856 


Rattus sp.


7acomp protein 


1845 


84 


465 


AF223408 


Homo sapiens 


B99 


3686 


99


466 


AF223408 


Homo sapiens 


B99 


O Q 1 Q 

it) / O 


an ■ " ' ' 
87 


467 


AF104415 


Mus musculus 


gene trap locus- 13 


6336 


91 


468 


U53450


Rattus 
norvegicus 


****** uaui^j. J-t.ai.XUll pi ULC 111 X 

JDP-1 


JL 3D 


49 


469 


AL031297 


Homo sapiens 


dJ97P20.1 (novel gene) 


3564 


99 


470 


AF257077 


Homo sapiens 


eukaryotic translation
initiation factor EIF2B
subunit 3


1 O 7 4 


95 


471 


L28125 


Podospora 
anserina 


beta transducin-like protein


^ 84 


38


472 


Y84903 


Homo sapiens 


A human proliferation and 
apoptosis related protein. 


2337 


100 


473 


AF144237 


Homo sapiens 


LOMP protein




44 


474 


Y71213 


Homo sapiens 


Human irritable bowel disease 

related nrtl vnpnhi Ho tvytq 


838 


100


475 


Y95006 


Homo sapiens 


Human secreted protein
vel3_1, SEQ ID NO: 52.


3411 


100 


476 


D38549 


Homo sapiens 


hal025 is new 


6533 


99 


477 


AF241230 


Homo sapiens 


TAK1-binding protein 2


3656 


100 


478 


AL031534 


Schizosaccharomyces
pombe


putative asparagine synthase


482 


40 


479 


L28125 


Podospora 
anserina 


beta transducin-like protein


233 


26 


480 


AF161544 


Homo sapiens 


HSPC059 


434 


77 


481 


AJ238248


Homo sapiens 




*J Qfl C 


99 


482 


Z38061 


Saccharomyce 


mal5, stal, len: 1367, CAI : 

0 1 AMVT4 VPAQT Dnn£zin 

GLUCOAMYLASE S1 (EC 3.2.1.3)


295 


23 


483 


AF161381 


Homo sapiens 




1404 


100 


484 


AF223468 


Homo sapiens


huuz j. p roc ti in 


1314 


100 


486 


X57527 


Homo sapiens 


alpha 1(VIII) collagen


4166 


99 


487 


Y19062 


Homo sapiens


j-?is.j pi uLcin 


2475 


100 


488


Y73373 




HTRM clone 921803 protein
sequence . 


555 


56 


489 


AL021918


Homo 
sapiens 


b34I8.1 (Kruppel related Zinc 


4184 


100 


490 


X53773 


Rattus 
norvegicus 


938) 


4675 


97 


491 


U52426 


Homo sapiens 


GOK 


1459 


59 


492 


AL359773 


Leishmania 
major


possible threonine synthase 


702 


45 


493 


AF226614 


Homo sapiens 


ferroportinl 


2929 


100 


494 


Z93241 


Homo sapiens 


viu j . x \ iiuvci proLcin 
with some similarity to 

1/1 WOw^llllCI JS. f\_ri jVJZjI v / 


513 


96 


495 


AF036977 


Homo sapiens 


unknown 


1812 


100 


496 


U93564 


Homo sapiens 




133 


45 


497 


Y91405 


Homo sapiens 


Human secreted protein 
sequence encoded by gene 2


357 


100 


498 


AF069781 


melanogaster 


Dciino-iijcc protein 


653 


43 


499 


Y16601 




Human -^r^rr* 1 ! 
^siiua^jxivjjjx. ULclll uuLIr « • 


1658 


98 


500 


X70944 


Homo sapiens 


PTB-associated splicing 
factor 


3883 


100 


501 


AF027503 


Mus 

musculus 


putative membrane-associated
guanylate kinase 1 


205 


36 


502 


AF282874 


Homo sapiens 


nectin 3; PRR3 


2856 


99 


503 


AJ249732 


Homo sapiens 


G8 protein 


669 


100 


504 


AF208861


Homo sapiens 


BM-019 


1629 


100 


505 


L09708 


Homo sapiens 


complement component C2 


4022 


100


507 


X66285 


Mus musculus 


HC1 ORF 


115 


43 


508


D00189 


Rattus 
norvegicus 


Na+,K+-ATPase alpha-subunit


5227 


99


509 


Y94971 


Homo sapiens 


Human secreted protein clone 
fa171_1 protein sequence SEQ
ID NO:148.


2176 


100 


510 


AB019038 


Homo sapiens 


beta-1,4 mannosyltransferase


781 


77 


511 


AB019038 


Homo sapiens 


beta-1,4 mannosyltransferase


1347 


100 


512 


AB019038 


Homo sapiens 


beta-1,4 mannosyltransferase


1520 


99 


513 


X84908 


Homo sapiens 


phosphorylase kinase 


5729 


99 


514 


X52851


Homo sapiens 


peptidylprolyl isomerase 


650 


76 


515 


AF186084 


Homo 
sapiens 


epidermal growth factor 
repeat containing protein 


3046 


99 


516 


G03602 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7683. 


505 


99 


517 


U04706 


Bos taurus 


50 kDa protein 


1749 


77 


518 


G00653 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 4734. 


530 


100 


519 


AF161475 


Homo sapiens 


HSPC126 


1368 


100 


520 


Y993£^ 


Homo sapiens 


Human PRO1475 (UNQ746) amino
acid sequence SEQ ID NO: 88. 


3394 


97 


521 


AF266852 


Homo sapiens 


PTPLA 


1295 


100 


522 


AE000995 


Archaeoglobu 
s fulgidus 


chromosome segregation 
protein (smcl) 


153 


20 


523 


AF062249 


Homo sapiens 


immunoglobulin heavy chain 
variable region 


605 


97 


524 


AJ223830 


Rattus 
norvegicus 


ARE1 


2950 


y u 


525 


W01535 


Homo sapiens 


Cellular homologue of the 
SV40 large T antigen. 


1 0 1 c 


□ "3 

oJ 


526 


AF145658


Drosophila 
melanogaster 


BcDNA.GH10229


320 


33 


527 


AF112213 




protein 


D<54 


79 


528 


D49387 


Homo 
sapiens 


NADP dependent leukotriene b4
12-hydroxydehydrogenase




100 


529 


Y30819 


Homo sapiens 


Human secreted protein 
encoded from gene 9 . 


328 


32 


530 


AL079335 


Homo sapiens 


dJ132F21.3 (72.1 KDa protein 
(DKFZP564A032, SBBI88)
similar to mouse IFN-gamma
induce MG11.)


1059 


99 


531 


Y9150* 


Homo sapiens 


Human secreted protein 
sequence encoded by gene 56 
SEQ ID NO: 179. 


1159 


98 


532 


X76116 


Caenorhabdit 
is elegans 


carrier protein (c2) 


576 


50 


533 


X76116 


Caenorhabdit 
is elegans 


carrier protein (c2) 


506 


50 


534 


X12966 


Homo sapiens 


3-oxoacyl-CoA thiolase 
propeptide (424 AA)


1972 


100 


535 


Y09267 


Homo sapiens 


flavin-containing
monooxygenase 2 


2486 


100 


536 


Z11773 


Homo sapiens 


SRU-ZBP 


2201 


99 


537 


D84224 


Homo sapiens 


methionyl tRNA synthetase 


4741 


99 


538 


D84224 


Homo sapiens


methionyl tRNA synthetase 


3887 


99 


539 


D84224 


Homo sapiens 


methionyl tRNA synthetase 


2933 


96 


540 


D84224 


Homo sapiens 


methionyl tRNA synthetase 


/C5Q 


99


541 


^03244 


Bos taurus 


H+ ATPase 31kDa subunit (EC 
3.6.1.3) 


848 


77 


542 


Y92514 


Homo sapiens 


Human OXRE-n. 


2301 


99 


543 


AF221712


Homo 
sapiens 


Smad- and Olf-interacting
zinc finger protein 


2151 


61 


544 


AE000919 


Methanobacterium
thermoautotrophicum


conserved protein 


207


38 


545 


A06669 


synthetic 
construct 


preTGF-betal 


2070 


99


546


Y02698 


Homo sapiens 


Human secreted protein 
encoded by gene 49 clone
HTPCS60.


854 


98 


547 


AF112205 


Homo sapiens 


WSB-1 protein 


2275 


100 


548


X60271 


Mus musculus 


c-rel 


2264 


74 


549 


AC016827 


Arabidopsis 
thaliana 


putative GTPase 


810 


42 


550 


Y70400 


Homo 
sapiens 


Human cell-signalling
protein-2.


429 


68 


551 


AB048365 


Homo sapiens 


NEDD4-like ubiquitin ligase 1 


8290 


99 


552 


Y57880 


Homo sapiens 


Human transmembrane protein 
HTMPN-4 . 


1112 


95 


553 


AF119855


Homo sapiens 


PRO1847


265 


67 


554 


M17236 


Homo sapiens 


MHC HLA-DQ alpha precursor


1332 


100


555 


AL078468 


Arabidopsis 
thaliana 


putative protein 


540 


40 


556 


AC006963 


Homo sapiens 


similar to Kelch proteins; 
similar to BAA77027 
(PID:g4650844)


515 


44 


557 


AK024487 


Homo sapiens 


FLJ00086 protein 


1623 


98 


558 


M12140 


Homo sapiens 


pol gene protein/ Xxx 


117 


48 


559 


W74825 


Homo sapiens 


Human secreted protein 
encoded by gene 97 clone 
HAQBF73 . 


225 


DO 


560


X56681


Homo sapiens 


junD protein 


373 


88 


561


AF003136


Caenorhabditis
elegans


contains weak similarity to 
an AMP-binding motif 


2926 




562 


AL109839 


Homo sapiens 


dJ1069P2.3.1 (novel PABPC1
(poly(A)-binding protein)


877 


100 


563 


AF181640 


Drosophila 
melanogaster 


BcDNA.GH09817


289 


42 


564 


AF052723 


Feline
leukemia
virus


gag-pol precursor polyprotein 
gPr80 


1547


43 


565 


AF161472 


Homo sapiens 


HSPC123 


439 


44


566


Y28817 


Homo sapiens 


pt326_4 secreted protein. 


3338 


100 


567 


U09848 


Homo sapiens 


zinc finger protein 


1738 


100 


569 


AF155113 


Homo sapiens 


NY-REN-55 antigen 


3603 


93 


570 


AF155113 


Homo sapiens 


NY-REN-55 antigen


3951 


99 


571 


AL032821


Homo sapiens 


dJ55C23.1 (vanin 1)


1821 


98 


572 


M69181 


Homo sapiens 


non-muscle myosin B


7350 


99 


573 


M69181 


Homo sapiens 


non-muscle myosin B 


7311 


98 


574 


Y59678 


Homo sapiens 


Secreted protein 108-008-5-0- 
E6-FL. 


772 


100 


575 


AL365234 


Arabidopsis 
thaliana 


putative protein 


788


40 


576 


AL365234 


Arabidopsis 
thaliana 


putative protein 


788 


40 


577 


X06745 


Homo sapiens 


DNA polymerase alpha-subunit 
(AA 1-1462)


7619 


99 


578 


AB041642 


Homo sapiens 


PAR-6


1342 


100 


579 


D86984 


Homo sapiens 


similar to yeast adenylate 
cyclase (S56776) 


2446 


100 


580 


AF165124


Homo sapiens 


gamma-aminobutyric acid A 
receptor gamma 2 


2499 


QQ 

y v 


581 


W88812 


Homo sapiens 


Polypeptide fragment encoded
by gene 58. 


2339 


y y 


582 


U82319 


Homo sapiens 


novel ORF 


342 


100 


583 


P92219 


Homo sapiens 
(human) 


CR1 protein. 


11425 


99 


584 


AJ223948 


Homo sapiens 


RNA helicase 


6608 


99 


585 


Y08612 


Homo sapiens 


88kDa nuclear pore complex 
protein 


3874 


99 


586 


Y42384 


Homo 
sapiens 


Amino acid sequence of 
Iv3l0 7. 


1007 


37 


587 


AF129756 


Homo sapiens 


BAT4 


1873 


98


588 


AF131775 


Homo sapiens 


Unknown 


1929 


99 


589 


AJ250865 


Homo sapiens 


TESS 2 


2348 


100 


591 


Z98885


Homo sapiens 


dJ522J7.2 (bromodomain-
containing 1 (similar to
peregrin, BR140))


4167 


100 


592 


L76571 


Homo sapiens 


nuclear hormone receptor 


1355 


100 


593 


AF091622 


Homo sapiens 


PHD finger protein 3 


9054 


100 


594 


X56807 


Homo sapiens 


desmocollin type 2a 


4443 


100 


595 


AL137802 


Homo sapiens 


dJ798A10.1 (novel protein) 


212 


55 


596 


AL022329 


Homo 
sapiens 


DK407F11.2 (adrenergic, beta, 
receptor kinase 2) 


3653 


100 


597 


AF226048 


Homo sapiens 


GL003 


2009 


99 


598 


AJ278112 


sapiens]
>Y49635
Y49635 21-OCT-1999
15-APR-1998
Human sdp3.5
protein.
[Homo
sapiens


protein 




23 


599 


Y59741 


Homo sapiens 


Human normal ovarian tissue 
derived protein 10. 


1574 


99 


600 


L36531


Homo sapiens 


integrin alpha 8 subunit 


5386 


99 


601 


Y38458 


Homo sapiens 


Human secreted protein 
encoded by gene No. 20. 


895 


100 


602 


AF218584 


Homo sapiens 


GGA1 


3265 


100 


603


Y13115 


Homo sapiens 


serine/threonine protein
kinase 


5071 


99 


604 


AL132776 


Homo sapiens 


dJ393D12.1 (KIAA0776) 


2413 


QQ 

-* -* 


605 


AL034452 


Homo sapiens 


dJ682J15.1 (novel Collagen 
triple helix repeat 
containing protein) 


1979 




606


Y14494 


Homo sapiens 


aralarl 


3465 


99 


607 


AJ001981 


Homo sapiens 


OXA1L


2603 


100 


608 


X86098 


Homo 
sapiens 


binds directly to adenovirus 
type 5 E1A protein 


30^9 


100 


610 


AF163572


Homo sapiens 


Forssman glycolipid 
synthetase 


1865 


99 


611 


AF161503 


Homo sapiens 


HSPC154 


1261 


97 


612 


L41834 


Ensis minor 


nuclear protein 


345 


30 


613 


Y91954 


Homo sapiens 


Human cytoskeleton associated 
protein 9 (CYSKP-9) . 


3668 


100 


614 


AL022327 


Homo sapiens 


dJ355C18.1 (KIAA0027) 


361 


94 


615 


X85786 


Homo sapiens 


binding regulatory factor 


3203 


100 


616 


Y08319 


Homo sapiens 


kinesin-2 


3487 


99 


617 


D12644 


Mus musculus 


KIF2 protein 


3609


97 


618 


U28789 


Mus musculus 


PACT 


5936 


89 


619 


Y35914 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
163. 


1684 


99 


620 


AB046382 


Mus musculus 


testis-abundant finger 
protein 


199 


23 


621 


Y00062 


Homo sapiens 


precursor polypeptide (AA -23 
to 1120) 


3440 


99 


622 


AF068286 


Homo sapiens 


HDCMD38P 


861 


100 


623 


X98248


Homo sapiens 


sortilin 


4436 


99 


624 


X61100 


Homo sapiens 


75 kDa subunit NADH 
dehydrogenase precursor 


3734 


99 


625 


S58544 


Homo sapiens 


75 kda infertility-related 
sperm protein 


2125 


99 


626 


AF151027 


Homo sapiens 


HSPC193 


582 


93 


627 


X14968 


Homo sapiens 


RH-alpha subunit (AA 1-404) 


2079 


100 


628 


Y50911 


Homo sapiens 


Human fetal brain cDNA clone 
vb7_l derived protein 


1983 


100


629


Y50911 




Human fetal brain cDNA clone


1694 


100 


630 


AF098786


Homo 


17 beta-hydroxysteroid
dehydrogenase type VII


1754 


100 


631 


AL034555 


Homo 
sapiens 


di.7134019 1 ( ? i r\r> ^nnnov- 




100


632 


W74826 


Homo sapiens 


Human secreted protein
encoded by gene 98 clone
HAQBT94.


794 


Q C 

3 0 


633 


AF288288


Homo sapiens 


HPT protein 


2236 


100


634


AF041429 


Homo sapiens 


pRGRl 


823 


99 


635 


X66357 


Homo sapiens


serine/threonine protein
kinase




100 


636 


Y11284 


Homo sapiens 


AFX1 


2571 


98 


637


AB004884 


Homo sapiens 


PKU-alpha


3718 


Q Q 


638 


AJ002303


Homo sapiens 


synaptogyrin 1c


1020 


100 


639 


AJ002304 


Homo sapiens




±uuz 


100 


640 


AJ002303 


Homo sapiens 


synaptogyrin 1c


933 


94 


641 


D8 7fi R? 


Homo sapiens


similar to a C. elegans
protein encoded in cosmid
T26A5.


2676 


100 


642 


M146G0 






2473 


99 


643 


X06661 


Homo sapiens 


calbindin (AA 1-261) 


1358 


100 


644 


r\L jl J. j j \j \j 


UAmA eani ana 


f kuz a z z 


185 


76 


645 


AB031048 


Drosophila 
melanogaster 


microtubule associated- 
protein orbit 


738 


27 


646 


AF250842 


Drosophila 
melanogaster 


multiple asters 


834 


29 


647 


X86691 


Homo sapiens 


Mi-2 protein 


10110 


99 


648 


U67934 


Homo sapiens 


44.9 kDa protein C18B11
homolog


827 


96 


649 


AF236061 


Oryctolagus 
cuniculus 


RING-finger binding protein 


3830 


91 


650 


AL034553 


Homo sapiens 


dJ914P20.2 (KIAA0784 protein 
similar to Mus musculus 
activity-dependent
neuroprotective protein
(Adnp))


5708 


100 


653 


X14766 


Homo sapiens


otihnni t* 


2388 


99 


654 


AC004614 


Homo sapiens


O i TTI "i lav H /"\ 45 _ 0 v^/~vyi ^ *! n n >*■ y-\ Tia 

0 J.111XJL0X iu i. - apcjnaxxi proceins 


3026


99 


655 


Y57908 


Homo sapiens


Human transmembrane protein
HTMPN-32.


b U 0 


99 


656 


Z34975 


Homo sapiens 


ldlCp 


J / 


100 


658 


AL050306


Homo sapiens


viu 7t f jo / . * iiiu vci piuLtsiri/ 


-L ^ ^ 




659 


W76734 


Homo
sapiens


Human mDia Rho targeting
protein.


781 


34 


660 


AF202724 


Homo sapiens 


Sad1 unc-84 domain protein 1


2172 


100 


661 


Z21966 


Homo sapiens 


mPOU homeobox protein 


1529 


100 


662 


nu ^ *± z> j j *» 


Mus musculus


dysferlin


4752 


59 


663 


AF182316


Homo sapiens


myoferlin


6232 


99 


665 


AL161516 


Arabidopsis 
thaliana


hypothetical protein 


209 


30 


667 




Homo sapiens 


valyl-tRNA synthetase 


3393 


99 


668 




Homo sapiens 


Amino acid sequence of 


3692 


100 


669 


AB010692 


Arabidopsis 
thaliana 


contains similarity to endo- 
beta-N-acetylglucosaminidase 
gene


611 


52 


671 


X56123 


Mus musculus


talin 


4474 


76 


672 


AB039371 


Homo sapiens 


mitochondrial ABC transporter 
3 


2902 


99 


673


AF269223 


Homo sapiens 


TCP11 


806 


42 


674 


AF229633 


Mus musculus 


groucho-related protein 4 


4053 


99


675 


L14463 


Rattus
norvegicus


1 transducin


3619


92


676 


AC005757 


Homo sapiens 


R32611_1


2779 


100 


677 


S61069 


Homo sapiens 


reverse transcriptase
homolog=pol {retroviral
element}


OCT 
£, 


fit; 


678 


AF271388


Homo sapiens 


CMP-N-acetylneuraminic acid
synthase




100


679 


X79066 


Homo sapiens 


ERF-1 


1783 


100 


680 


AF118566


Mus musculus


i lCLiict u vj^tj icuiw ZiiiiL. linger 

protein 


/by 


50 


681 


Y51415 


Homo sapiens


Human wild type pKe83 


2^21 


99 


682 


AL133545 


Homo sapiens 


bA386N14.1 (novel protein
similar to a dual specificity
phosphatase)


700 


68 


683 


Y86214 


Homo sapiens 


Nuclear transport protein 
exone nco-sHi protein 


5888 


99 


684 


Y94952


Homo sapiens


Human secreted protein clone
fh116_11 protein sequence
SEQ ID NO: 110.


•a C/i 


98 


685 


AL021878 


Homo sapiens 


dJ257320.4 (transcription 
factor 20 (AR1) (KIAA0292) 
(isoform 2))


154


D / 


686 


AE000198 


Escherichia 
coli 


orf, hypothetical protein


628 


100


687 


M58378 


Homo sapiens 


synapsin I 


3730 


99 


688


AF039597 


Homo sapiens 


antigen NY-CO-31 


508 




689 


U09355 


Oryctolagus 
cuniculus


protein phosphatase 2A1 B 
gamma subunit 


2356 


99 


690 


AF155106 


Homo sapiens 


NY-REN-36 antigen


265 


50 


691 


AC004774 


Homo sapiens 


Dlx-5 


1542 


100 


692 


X90530 


Homo sapiens 


ragB 


1926 


99 


693 


X90530 




ragB 


1405 


99 


694 


X90530 


Homo sapiens 


ragB 


1590 


85 


695 


G01563 


Homo sapiens


Human secreted protein, SEQ 
ID NO: 5644.


330 


100 


696 


AC011810


Arabidopsis
thaliana


Putative methionine
aminopeptidase


669 


52 


697 


AJ250425 


Rattus
norvegicus


Pn 1 "1 wV> t oH n T 
vuij.yuj.ouxu x 


O A C C 


98 


698 


AB037901 


Homo sapiens


gene amplified in squamous
cell carcinoma-1




99 


699


Y99401 


Homo sapiens 


Human /tMorfa^l amino 

acid sequence SEQ ID NO: 218. 


iJOO 


100


701 


AF221712 


Homo 
sapiens 


Smad- and Olf-interacting
zinc finger protein


O / U3 


100


702 


X83573 


Homo sapiens 


ARSE 


3184 


99 


703 


AJ243274 


Homo sapiens 


AP-2rep protein 


2078 


99 


704 


Y71262 


Homo sapiens 


Human chondromodulin-like 
protein, Zchml.


1697 


94 


705 


Y71262 


Homo sapiens 


Human chondromodulin-like 
protein, Zchml. 


1736 


99 


706 


Y41257 


Homo sapiens 


Amino acid sequence of long 
human FAIM. 


1060 


100 


707 


AL022237 


Homo sapiens 


bK1191B2.3 (PUTATIVE novel
acyl transferase similar to
C. elegans C50D2.7) (isoform
1))


2030 


100 


708 


AJ006266 


Homo sapiens 


AND-1 protein 


5942 


100 


709 


G01571 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5652. 


777 


99 


710 


Y08698 


Homo sapiens 


ranbp3 


2849 


98 


711 


Y68770 


Homo sapiens 


Amino acid sequence of a 
human phosphorylation 
effector PHSP-2.


754 


99 


712 


U93574 


Homo sapiens 


putative pl50 


799 


59 


713 


AC004531 


Homo sapiens 


Gene with similarity to DEAD
box helicases 


2715 


99


714 


D89016 


Homo sapiens 


Neuroblastoma 


538 


48 


715 


Y92175 


Homo sapiens 


Human cardiovascular system 
associated protein tyrosine 
phosphatase 2. 


734 


98 


716 


AL137013 


Homo sapiens 


DA311P8.3 (probable uracil 
phosphoribosyltransferase)


862 


100 


717


AB035123 


Mus musculus 


GD1 alpha/GT1a alpha/GQ1b


1696 


93 


718


Y96290 


Homo >P40254 
P40254 25- 
OCT-1984 09- 
APR-1983 
Human IgD. 
[Homo 
sapiens 


Human TfJFAM-9 i mrm in rirrl nKnl -i -n 




85 


719 


X07979 


Homo sapiens 


integrin beta 1 subunit 
precursor 


4347 


99 


720 


AJ224819 


Homo sapiens 


tumor suppressor 


2149 


99 


721 


Y07595 


Homo sapiens 


transcription factor TFIIH 


2373 


100 


722 


W41565


Homo 
sapiens]
>W41564 
W41564 08- 
OCT-1997 05- 
APR-1996 

Human 

calpain. 

[Homo 

sapiens 


Human calpain.


1591 


99 


723 


AF161341 


Homo sapiens 


HSPC078


1097 


98 


724 


AF187318 


Homo sapiens 


F-box protein Fbx2 


1607 


100 


725 


AC006708 


Caenorhabditis
elegans


contains similarity to
Saccharomyces cerevisiae pre-
mRNA splicing protein PRP31


1143 


46 


726 


AC006708 


Caenorhabditis
elegans


contains similarity to
Saccharomyces cerevisiae pre-
mRNA splicing protein PRP31
(GB:Z72876)


988 


46 


727 


AC024818


Caenorhabditis
elegans


contains similarity to Pfam
family PF00400 (WD domain, 
G-beha reneat" i r 001*0 = P 1 n 
E=1.4e-20, N=3


950 


44 


728 


AJ005897 


Homo sapiens 


JM5 




47 


729 


Y45377 


Homo sapiens 


Human secreted protein 
27. 


908 




730 


G03931 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 8012.


3/0 


100 


731 


AB012720 


Oncorhynchus 
masou


GTP-binding protein 


3865 


76 


732 


W73404 


Homo sapiens 


Human secreted protein 
encoded by Gene No. 8.


862 


97


733


G02650 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6 731. 


644 


97 


734


AC024813


Caenorhabditis
elegans


Hypothetical protein 
Y54F10AL.a 


152 


24 


735


AL035461 


Homo sapiens 


dJ967N21.6 (novel CDP-alcohol 
phosphatidyltransferase 1
family member protein) 


1562 


98 


736 


U00033 


Caenorhabditis
elegans


similar to S. cerevisiae YJU2 
protein 


605 


41 


737 


AF079098


Homo 
sapiens 


arginine-tRNA-protein
transferase 1-1p; ATE1-1p


2733 


99 



738


/\U 1 J 1 1 JL £ 




nucleoids Klift'ltcXJlCaoa 


^ / J? j 


100


739 


AJ133115 


Homo sapiens 


TSC-22-like protein 


2054 


99 


740




Homo sapiens 


M-phase phosphoprotein 9 


953 


100 


741 


X98258 


Homo sapiens 


M-phase phosphoprotein 9 


564 


74


742


us /iyi 


Caenorhabditis
elegans


strong similarity to the YPT1
sub-family of RAS proteins


960 


65 


743 


X76057 


Homo sapiens 


phosphomannose isomerase 


2191 


100 


744 


G03209 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7290.


496 


98 


745


X97064


Homo sapiens 


Sec23 protein 


4034 


99 


746 


W93946 


Homo sapiens 


Human regulatory molecule 
HRM-2 protein. 


994 


100 


747 


Y73388


Homo sapiens 


HTRM clone 3376404 protein 
sequence.


1565 


99 


748 


M19529 


Sus scrofa 


follistatin A 


1906 


98 


749 


AJ249457 


Trichomonas 
vaginalis 


centrin, putative 


183 


28 


750 


AC004410 


Homo sapiens 


fos39554_1


2094 


100 


751 


AF074968 


Homo sapiens 


p47ING3 protein 


2167 


100 


752 


AF252284 


Homo sapiens 


transcription specificity 
factor Spl 


4005 


100 


753 


AB049629 


Homo sapiens 


phospholysine 

phosphohistidine inorganic 
pyrophosphate phosphatase 


1375 


99 


754


u / jmuo 


Homo sapiens


ribosomal protein L39 


160 


77 


755 


AB008430 


Homo sapiens 


CDEP


142 


29 


758 


L32162 


Homo sapiens 


transcription factor 


574 


80


759 


AF037204 


Homo sapiens 


RING zinc finger protein 


295 


54 


760 


Y44250 


Homo 
sapiens 


Human cell signalling 
protein-13.


625 


100 


761 


AF218586 


Homo sapiens 


Cide-b 


1136 


100 


762 


U38934 


Gallus 
gallus 


histone H2A 


625 


97 


763 


AF226053 


Homo sapiens 


HSKM-B 


606 


32 


764 


X13403 


Homo sapiens 


Oct-1 protein (AA 1-743)


3626 


100 


765 


D87446 


Homo sapiens 


Similar to a C. elegans 
protein encoded in cosmid 
C27F2 (CJ40419) 


568 


38 


766 


AL023826 


Caenorhabditis
elegans


Y17G7B.14


200 


27 


767 


Y82777 


Homo sapiens 


Human chordin related protein 
(Clone dw665_4).


2551 


99 


768 


X92475 


Homo sapiens 


ITBA1 


1429 


100 


769 


Y42752 


Homo sapiens 


Human calcium binding protein 
3 (CaBP-3).


1426 


100 


770


X51416


Homo sapiens 


hormone receptor hERR1 (AA 1-
521) 


2641 


97 


771 


AJ006591


Homo sapiens 


cysteine-rich protein 


1793 


100 


772 


A08695 


Homo sapiens 


rap 2 


935 


100


773


Z12173 


Homo sapiens 


N-acetylglucosamine-6-
sulphatase


2970 


100 


774 


Y91950 


Homo sapiens 


Human cytoskeleton associated 
protein 5 (CYSKP-5).


565 


43 


776


AL023799 


Homo sapiens 


dJ322P7.1 (zinc finger) 


855


56 


777


AL023799 


Homo sapiens 


dJ322P7.1 (zinc finger) 


855 


56 


778 


G01880 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5961.


849


98 


779 


AJ012590 


Homo sapiens 


glucose 1-dehydrogenase


4155 


99 


780 


AL078582


Homo sapiens 


dJ130E4.2 (KIAA0796) 


1321 


68 


781 


Z75955 


Caenorhabditis
elegans


similar to mitochondrial 
carrier protein 


384 


34 


782 


AL109965 


Homo 
sapiens 


dJ1121G12.2 (SCAN domain- 
containing 1 protein) 


900 


100 


783 


AF061262 


Mus 

musculus 


semaF cytoplasmic domain 
associated protein 2 


1316 


83 


784 


G03873 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 7954.


649


95


785 


Y84441 


Homo sapiens 


Amino acid sequence of a 
human RNA-associated
protein.


2074 


100 


786 


Y00918 


Homo sapiens 


Human Rab protein, RABP-1, 
protein sequence. 


1048 


99 


787




Homo sapiens 


ribonuclease HI large subunit 


1548 


99 


788 


AB035384 


Homo sapiens 


SRp25 nuclear protein 


962 


94 


789 


AF024631 


Homo sapiens 


ANG2 


2644 


100 


790 


AJ006710 


Rattus
norvegicus 


phosphatidylinositol 3-kinase


4506


97 


792 


V00638 


bacteriophage
lambda


reading frame ea10


600 


100 


793 


AF049103 


Homo sapiens 


Huntingtin interacting 
protein 


819 


100 


795 


Z26317 


Homo sapiens 


desmoglein 2 


4810 


99 


796 


Y76884 


Homo sapiens 


Retinoblastoma binding 
protein-7 sequence.


5080 


99 


797 


U15155 


Gallus 
gallus 


trypsinogen 


372 


37 


798


U97189


Caenorhabditis
elegans


strong similarity to the
PI3/PI4 family of kinases


227 


28 


799 


AF112201 


Homo sapiens


neuronal protein NP25 


1053 


100 


800 


AF234765 


Rattus 
norvegicus 


serine-arginine-rich splicing 
regulatory protein SRRP86 


958 


63 


801 


AF267852 


Homo sapiens 


placental protein 13-like 
protein 


743 


99 


802 


AF208851 


Homo sapiens 


BM-009 


766 


80 


803 


Z81097 


Caenorhabditis
elegans


Similarity to Human 
retinoblastoma-binding
protein RBAP46 yk662dl2.5 
comes from this gene 


152 


27 


804 


G02113 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6194.


496 


98 


805 


AL121673 


Homo sapiens 


DA305P22.1 (novel protein)


1160 


100 


806 


AC013483


Arabidopsis
thaliana 


putative GTPase activator 
protein 


264 


30 


807 


AC013483 


Arabidopsis
thaliana 


putative GTPase activator 
protein 


264 


30 


808 


AB013885 


Homo sapiens 


beta-ureidopropionase 


1494 


100 


809 


AF078842 


Homo sapiens 


HOTTL protein 


1581 


99 


810 


AF161421 


Homo sapiens 


HSPC303 


2134 


96 


811 


AF261689 


Homo sapiens 


DNA polymerase epsilon p17
subunit 


734 


100 


812 


Z74029 


Caenorhabditis
elegans


Similarity to C. elegans 
alcohol dehydrogenase comes 
from this gene 


610 


71 


813 


Z73497 


Homo sapiens 


CU240C2.2 (Core histone 
H2A/H2B/H3/H4) 


324 


100 


814 


W87689 


Homo 
sapiens 


Human HTXFT19 polypeptide. 


1484 


99 


815 


X16282 


Homo 
sapiens 


zinc finger protein (217 AA) 
(1 is 2nd base in codon) 


1109 


99 


816 


Z92539 


Mycobacterium
tuberculosis


pth 


300 


36 


818




Mus musculus 


B9 


197 


27 


819


AL117555 


Homo sapiens 


hypothetical protein 


321 


94 


820 


AC005328 


Homo sapiens 


R26660_2, partial CDS 


865 


97 


821 


G03951 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8032. 


700 




822 


L34807 


Musca 
domestica 


transposase 


174 


20 


823 


G02928 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7009. 


558 


78 


824


Z99531 


Schizosaccharomyces
pombe


caffeine-induced death
protein 1


184


29


825 




Homo sapiens 


ultra high sulfur keratin


693 


68 


826 


U23037 


Oryctolagus
cuniculus


eIF-2Bepsilon 


3406 


90 


827 


G03412


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7493.


464 


100 


828 


I J u O z / 


Homo sapiens 


Human secreted protein 
encoded from gene 17. 


113 


44 


829


V3 Ol DO 


Homo sapiens 


Human receptor molecule (REC) 
encoded by Incyte clone 

n n •> n •> -7 q 


1012 


100 


830 


W78279 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 33.


1264 


99 


832


AB011542


Homo sapiens 


MEGF9 


2097 


100 


833 


G02639 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6720.


223 


70 


834




, 

Homo sapiens 


transcriptional regulator 
protein HCNGP


1574 


100 


835 


AF119664 


Homo sapiens


transcriptional regulator
protein HCNGP 




89


836 


AF119664 




transcriptional regulator
protein HCNGP 


1448 


94 


837 


X12517 


Homo sapiens




918 


100 


838 


U32865 


Drosophila
melanogaster


linotte protein 


164 


24 


839


AF067730 


Homo sapiens 


TLS-associated protein TASR-2 


631 


56 


840 


U27831


Homo sapiens 


striatum-enriched phosphatase 


2840 


98


841


AF286366


Homo sapiens 


CamKI-like protein kinase 


1796 


100 


842 


G02309 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6390.


278 


98


843


AtiUO j bib 


Drosophila 
melanogaster


ade3 gene product 


113 


48 


844 


G01350


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5431.


629 


100 


845


UZ iOjO 


Mus musculus


glycosyl-phosphatidyl-
inositol-anchored protein
homolog


3305 


96 


847 


Y87788 


Homo sapiens 


Human RBP-26 protein. 


2026 


100 


848 


API 


Homo sapiens 


Diff33 protein homolog


2398 


100 


849 


U41315 


Homo sapiens 


ZNF127-Xp 


2458 


93 


850 


AF192784


Homo sapiens 


makorin 1 


2062 


97 


851 




Homo sapiens


Protein regulating gene 
expression PRGB-21.


1548 


100 


852 


Z22968 


Homo sapiens 


M130 antigen 


6205 


100 


853 


"7*5 *5 Q T 1 


Homo sapiens 


M130 antigen extracellular
variant 


6380 


100 


854 


G03362


Homo sapiens 


Human secreted protein, SEQ
ID NO: 7443.


330 


96 


855 


G03362 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 7443.


203 


100 


856 


nf 403110 




Homo sapiens 


CGI-203


452 


100 


857


ALvJUoUby 


Arabidopsis 
thaliana


putative cleavage and 
polyadenylation specificity
factor 


1383 


55 


858 


AL021546 


Homo sapiens 


Cytochrome C Oxidase 
Polypeptide VIa-liver
precursor (EC 1.9.3.1) 


593 


100 


859 


L02956 


Xenopus 
laevis 


ribonucleoprotein 


1664 


85 


860 


AF201947 


Homo sapiens 


MEK binding partner 1 


616 


100 


861 


L31783 


Mus musculus 


uridine kinase 


1266 


92 


862 


AF161472 


Homo sapiens 


HSPC123 


602 


73 


863 


Z49068 


Caenorhabditis
elegans


mitochondrial carrier protein 


370 


43 


864 


AF154108 


Homo sapiens 


tumor necrosis factor type 1
receptor associated protein


3559


99


865 


AE001530 


Helicobacter
pylori J99




O-jn 

fi JU 


32 


866 


X57807 


Homo sapiens


immunoglobulin lambda light
chain




91 


867 


AL031673


Homo sapiens 


dJ694B14.1 (PUTATIVE novel 
KRAB box protein with 18 C2H2 


4066 


99 


868 


Y11652 








100 


869 


AF192968 


Homo sapiens 


high-glucose- regulated 
protein 8 


3041 


99 


870 


AB020648


Homo sapiens


KIAA0841 protein


3237 


99 


871 


AL031427 


Homo sapiens 


dJ167A19.1 (novel protein) 


1608 


100 


872 




Homo sapiens


core histone macroH2A2.2 


1866 


100 


873 


AL021331 


Homo sapiens 


dJ366N23.1 (putative C. 
elegans UNC-93 (protein 1, 
C46F11.1) LIKE protein)


1129 


100 


874 


All QUO 


Homo sapiens


propionyl-CoA carboxylase 


3579 


100 


875 


AL117334 


Homo sapiens


uuoo / r xj. . i iiiovci p roue in 

(part of translation of cDNA 

DKFZn414MDfi1 Pm • RT 1 T fi9i o) 1 
uivru^t j. ( mil ; j\x»x XU y ) ) 


306 


100 


876


X79489 


Saccharomyces
cerevisiae


E-925 protein 


446 


35 


877 




Homo sapiens


Human secreted protein clone 
dn834_1 protein sequence SEQ
ID NO: 8. 


811 


100 


878 




Homo sapiens 


OUMDT C 
LrifliV J. . D 


957 


100 


879 


a / y*t x t 


Sus scrofa 


40S ribosomal protein S12 


687 


100 


880 




Saccharomyces
cerevisiae


Soilp 


478 


28 


881


Y87275 


Homo sapiens 


Human signal peptide 
containing protein HSPP-52 

O TO MO . CO 


2547 


100


882 


M14036 


Homo sapiens


l.i - lnniDitor 


598 


77 


883 


AB041261 


Homo sapiens 


calcium-independent
phospholipase A2 


2903 


100 


884 


AF020313 




proline-rich protein 48 


999 


84 


885 


Y10936 


Homo sapiens 


hypothetical protein 


1104 


99 


886 


AF073997 


Mus musculus 


myotubularin related protein 
1 


866 


36 


887 


Y57893 


Homo sapiens 


Human transmembrane protein 
HTMPN-17. 


1099 


94 


888 


AL.11 7&"\K 


Homo sapiens 


hypothetical protein 


929 


99 


889 


AF210317 


Homo sapiens 


facilitative glucose 
transporter family member 

GLUT9


2046 


99 


890 


Y36031 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
416. 


583 


100 


891 




Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
416. 


192 


57 


892 


AF237631 


Homo sapiens 


ubiquitous tropomodulin U- 
Tmod 


1798 


100 


893 


AF090929 


Homo sapiens 


PR00477p 


653 


99 


894 


AL031228 


Homo sapiens 


dJ1033B10.2 (WD40 protein 
BING4 (similar to S.
cerevisiae YER082C, M. sexta 
MNG10 and C. elegans F28D1.1) 


3196 


100 


895 


AL031228 


Homo sapiens 


dJ1033B10.2 (WD40 protein
BING4 (similar to S. 
cerevisiae YER082C, M. sexta 
MNG10 and C. elegans F28D1.1)


2825 


96 


896 


AF171102 


Homo sapiens 


retinal degeneration B beta 


1302 


95 


897 


AE003551 


Drosophila 
melanogaster 


CG18176 gene product 


633 


33
TABLE 2



PCT/US00/34263 



SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH- 
WATERMAN 
SCORE 


IDENTITY 


898


AJ237946 


Homo sapiens 


DEAD Box Protein 5 


2443 


100 


899 


Z97184 


Homo sapiens 


HKE2 


624 


100 


900 


Z97184 


Homo sapiens 


HKE2 


409 


98 


901 


AJ245587 


Homo sapiens 


Kruppel-type zinc finger 


1942 


100 


902 


AF091034 


Homo sapiens 


GTP-binding protein RAB22A 


1011 


100 


903 


R95953 


Homo sapiens 


Eukaryotic cell growth 
inhibiting factor. 


414 


96 


904 


L04733 


Homo sapiens 


kinesin light chain 


1936 


72


905 


AE003540 


Drosophila 
melanogaster 


CG10984 gene product 


446 


33


906


M55542 


Homo sapiens 


guanylate binding protein 
isoform I 


2993 


QP 


907 


M55542 


Homo sapiens 


guanylate binding protein
isoform I




Q C 


908 


W8408S 


Homo sapiens 


Human membrane fusion protein 
WDProl.


XO07 


100 


909 


AF168676 


Homo 
sapiens 


TNF intracellular domain- 
interacting protein 


647 


100 


910 


AB029150 


Homo sapiens 


KRAB zinc finger protein 
HFB101L 


2196 


100 


911 


G02871 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6952. 


521 


100 


912 


G03162 


Homo sapiens 


Human secreted protein, SEQ 


387 


87 


913 


AJ243721 


Homo 

sapiens] 

>Y92508 

VQO c n o 1 o 

APR-2000 06- 

Human OXRE- 
5 . [Homo 


dTDP-4-keto-6-deoxy-D-glucose 
4-reductase 


1710 


100 


914 


U24189 


Caenorhabditis
elegans


hypothetical protein 1207-1; 
Method: conceptual 
translation supplied by 


244 


41 


915 


Y02591 


Homo sapiens


A human progesterone receptor 
complex p23-like protein. 


843 


99 


916 


AE000984 


Archaeoglobus
fulgidus


dinitrogenase reductase
activating glycohydrolase
(draG)


171 


26 


918


M23159 


Cricetus 
cricetus 


DHFR-coamplified protein




3 0 


919 


L12018 


Caenorhabditis
elegans


putative




41 


920 


AF102177 


Homo sapiens 


tumor antigen SLP-8p 


i "yen 


97 


921 


AL096712 


Homo sapiens


diT74 4T?4 ? (qimilnr \-r* a 

UU / 1 4. a t . t, \ OXIIIXXuX L. \J Cl 

novel human gene mapping to 
Activa t*oy"} 


n ni n 

JLUJ. f 


78 


922 


AL161495 


Arabidopsis 
thaliana 


putative WD-repeat protein 


866 


42 


923 


AL161495 


Arabidopsis 
thaliana 


putative WD-repeat protein 


442 


36 


924 


U97001 


Caenorhabditis
elegans


similar to
Schizosaccharomyces pombe


605 


51 


925 


X71978 


Mus musculus 


Fif 


1503 


95 


926 


M92288 


Drosophila 
melanogaster 


beta-spectrin 


290 


51 


927 


Y27575 


Homo sapiens 


Human secreted protein 
encoded by gene No. 9.


1392 


100 


928 


Y22499 


Homo sapiens 


Human secreted protein 
sequence clone mh703_1.


2249 


100 


930 


AJ224326 


Homo sapiens 


ribulose-5-phosphate- 
epimerase 


912 


100 


931 


U28991 


Caenorhabditis
elegans


coded for by C. elegans cDNA
cm21c7


660


55


932 


AL080065 


Homo sapiens 


hypothetical protein 


210 




933 


G01884


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5965. 


767 


98 


934 


AJ276485 


Homo sapiens 


integral membrane transporter 
protein 


1200 


100


935 


AL035681 


Homo sapiens 


dJ756G23.3 (novel protein 
similar to drosophila 
transcriptional repressor) 


1142 


80 


936


AB026808 


Mus musculus


synaptotagmin XI 


2142 


95 


937 


AB015345 


Homo sapiens 


HRIHFB2216 


2601 


99 


938 


X65724 


Homo sapiens 


ORF2


498 


100 


939 


W89024 


Homo sapiens 


Polypeptide fragment encoded
by gene 156.


11 O / 


100


940 


G04047 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8128.


117 


100 


941 


AF094583 


Homo sapiens 


putative HIV-1 infection 
related protein


452 


100 


942 


AC024200 


Caenorhabditis
elegans


contains similarity to 
several zinc finger proteins 
but not to the zinc finger 
domains 


J jU 


C Q 

o9 


943 


AF129756 


Homo sapiens 


G5c 


273 


100 


944 


M23765 


Rattus 
norvegicus 


alpha-tropomyosin


133 


96 


945 


AC009917 


Arabidopsis 
thaliana


Contains similarity to 


583 


47 


946 


AF223468 


Homo sapiens 


AD021 protein 


551 




947 


AF0I55473 


Homo sapiens 


GAGE-8


273 


51 


948 


X75756 


Homo sapiens 


protein kinase C mu 


2019 


68 


949 


AF143956 


Mus musculus 


coronin-2 


2300 


93 


950 


Y36729 


Homo

sapiens 


nuiiictn flix protein sequence . 


1861 


99 


951 


W49041 


Homo sapiens


Human low density lipoprotein

hi nfii nor nrnl-Ain r,RD_ o 


282 


67 


952 


AB016S81 


Arabidopsis 
thaliana 


aene iri-MXPn 7~ 

yc.ic^iu • ri<vv_± 1,1" 


Zltj 


46 


953 


Y01785 


Homo sapiens 


Human ubiquitin-conjugating

enzyme >Y25341 Y25341 01-JUL- 
1999 12-AUG-199R Human 

protein. 




100 


954 


AF145615 


Drosophila 
melanogaster 


BCDNA.GH03377 


823


u b 


955 


U09410 


Homo sapiens 


zinc finger protein ZNF131 


2483 


99 


956 


U09410


Homo sapiens 


zinc finger protein ZNF131




99


957


AF195623 


Homo sapiens 


cholinephosphotransferase 1
alpha


2126 


O Q 
J? 


958 


X94917 


Drosophila 
melanogaster 


head-elevated expression in 
0.9 kb 


155 


32 


959 


U54807


Rattus 
norvegicus 


GTP-binding protein 


1167 


97 


960 


AF058807 


Bos taurus 


GTP-binding protein rah


606 


97 


961 


G03244 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 7325. 




100


962 


AF078850


Homo sapiens


steroid dehydrogenase homolog 


583 


40 


963


AP001754 


Homo sapiens 


transient receptor potential- 
related channel 7, a novel 
putative Ca2+ channel protein 


317 


30 


964 


AL035419 


Homo sapiens 


dJ1100H13.1 (putative novel 
protein) 


1129 


100 


965 


X^1381 


Rattus 
rattus 


interferon- induced protein 


202 


46 


966 


D38169


Homo 
sapiens 


inositol 1,4,5-trisphosphate
3-kinase isoenzyme 


3278 


100 


967 


AL031432 


Homo
sapiens 


dJ465N24.2.1 (PUTATIVE novel
protein) (isoform 1)


893 


100 


968 


U79275 




unknown


Oil 


100


969 


AJ011306 


Homo 
sapiens 


guanine nucleotide exchange 
factor (long isoform) 


2752 


99 


970 


AF281134 


Homo sapiens 


exosome component Rrp46


HOD 


100


971 


U53336 


Caenorhabditis
elegans


weak similarity over a short
region to myosin heavy chain


536 


23 


972 


AC018749 


Leishmania
major


L8840.12


coq 
003 


do 


973 


AF188504 


Mus musculus


LNV 


544 


S3 


974 


U25801 


Homo sapiens 


Tax1 binding protein


ft^9 


Qft 


975 


AF049523 


Homo sapiens


huntingtin-interacting
protein HYPA/FBP11


1390 


97 


976 


AF161530 


Homo sapiens


HSPC182 


i ruin 


100


977 


G04020 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8101. 


626 


100 


978 


API fid 7Q7 

JL. D *i #7 / 


Homo sapiens


ribosomal protein L17 isolog 


908 


100 


979 


U94991 


Xenopus 
laevis 


transcription factor XLM01 


795 


97 


980 


S73775 


Homo sapiens 


calmitine; calsequestrine 


2029 


100 


981


VQ4 ft Q ft 
I3iOOO 


Homo sapiens


Human protein clone HPUi4b2.


2501 


100 


982 


AJ243191 


Homo sapiens 


heat shock protein 


827 


96 


983


X65020 


Bos taurus 


PSST subunit of the NADH: 
ubiquinone oxidoreductase 
complex 


964 


85 


984 


AJ249207 


Rhodococcus 
sp. AD45


putative racemase 


351 


43 


985 


Z30093 


Homo sapiens 


basic transcription factor 2, 
35 kD subunit 


1576 


99 


986


AbO J 0 o J b 


Homo sapiens 


contains two glutamine rich 
domains, three zinc-finger
domains, and matrin 3
homologous domain 3 (MH3)


4697 


99 


987 


AF227258 


Bos taurus 


RPGR-interacting protein-1 


1262 


38 


988




Homo sapiens 


dJ1042K10.2 (supported by
GENSCAN, FGENES and GENEWISE)


4048


99 


989 


AT, 00931 ft 




dJ1042K10.2 (supported by
GENSCAN, FGENES and GENEWISE)


2321 


99 


990 


AF161426


Homo sapiens


HSPC308


448 


92 


991 


AF161426 


Homo sapiens 


HSPC308


448 


92


992


MJ? 101140 


Homo sapiens 




453 


92


993


AL023859 


Schizosaccharomyces
pombe


tRNA-splicing endonuclease
subunit


172 


42 


994 




Homo sapiens


ajoioM9.i (novel Homeobox
domain protein)


241 


47


995 


AC005253 


Homo sapiens 


R26445_1


902 


100 


996 


AF265206 


Homo sapiens 


MOG1 isoform A


974 


100 


997




Pyrococcus
abyssi 


sarcosine oxidase, subunit 
beta (soxB) 


195 


28 


998 


ArjUO J 64 1 


Drosophila 
melanogaster


BG:DS00941.3 gene product 


218 


58 


999 


W69343 


Homo 
sapiens 


Secreted protein of clone 
CR930_1.


1340 


98 


1000 


AY007135 


Homo sapiens 


similar to bovine ADP/ATP 
translocase T1 mRNA with
GenBank Accession Number 
M24102.1 


1543 


100 


1001 


Y73381 


Homo sapiens 


HTRM clone 1877278 protein 
sequence . 


1668 


100 


1002 


AF208844 


Homo sapiens 


BM-002 


428 


100 


1003 


AE004944 


Pseudomonas 
aeruginosa 


hypothetical protein 


134 


35 


1004 


AL031431 


Homo sapiens 


dJ462023.2 (novel protein) 


2058 


100 


1005 


S45367 


Canis
familiaris


centractin 


1949 


100 
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TABLE 2 



SEQ 
ID 
NO: 


ACCESSION 
NUMBER 


SPECIES 


DESCRIPTION 


SMITH-
WATERMAN
SCORE


%
IDENTITY


1006 


S45367 


Canis
familiaris


centractin 


1315 


Oft 


1007 


AB022158 


Mus 

musculus 


chaperonin containing TCP-1
epsilon subunit 


2649 


96 


1008 


Y76332 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 38. 


1282 


97 


1009 


AB011414 


Homo sapiens 


Kruppel-type zinc finger 
protein 


1671 


58 


1010 


Z68218 


Caenorhabditis
elegans


K01H12.1 


269 


67 


1011 


AB011414 


Homo sapiens 


Kruppel-type zinc finger 
protein


1671 


58 


1012 


Z14000 


Homo sapiens 


RING1


2017 


100 


1013 


G02841 


Homo sapiens


Human secreted protein, SEQ
ID NO: 6922.


•Jin 
O J Z 


Q 1 " 


1014 


AF14^659 " 


Drosophila
melanogaster 


BcDNA RH1 dill 


1 T A A 


c o 


1015 


Y02860 


Homo sapiens


Fl*<5crm#» n r* nf human c r» <--3 

Drotein encoded bv apnp fiS 


CC/ 


o / 


1016 


Y02591 


Homo sapiens 


A human progesterone receptor
complex p23-like protein


772 


97 


1017 


Y99448 


Homo sapiens 


Human PRO1759 (UNQ832) amino
acid sequence SEQ ID NO: 174


2323 


100 


1018 


X67250 


Rattus 
norvegicus 


n-ch "5 mapr i n 


1 710 "" 
IrlU 


Q7 


1019 


AF183417 


Homo 
sapiens 


microtubule-associated
proteins 1A/1B light chain 3 


631 


100 


1020 


AF164795 


Homo sapiens


oca. icyuiautiu piULclil J CUIUS a 


b /4 


100 


1021 


AF190625 


Coturnix
coturnix 


qdg1-1


638 


96 


1022 


AL133363 


Arabidopsis 
thaliana 


putative protein 


155 


37 


1023 


AB034912 


Homo sapiens 


WD-repeat like sequence 


2483 


100 


1024 


AY007091 


Homo sapiens


similar to Homo sapiens
mammalian inositol
hexakisphosphate kinase 2
(IP6K2) mRNA with Ge


2243 


100 


1025 


X69910 


Homo sapiens 


P63 protein 


2958 


Q Q 


1026 


U80736 


Homo sapiens 


CAGF9 


1657 


100 


1027 


AB029333 


Halocynthia
roretzi


HrPET-1


1048 


54 


1028 


AB032931 


Homo sapiens 


ubiquitin-conjugating enzyme
isolog 


1045 


100 


1029 


G01797 




Human secreted protein, SEQ 
ID NO: 5878.


749 


98 


1030 


G01797 




Human secreted protein, SEQ
ID NO: 5878.


749 


98 


1031 


AF193795 


Homo sapiens 


vacuolar sorting protein 
VPS29/PEP11 


960 


100 


1032


AJ222968 


Mus musculus 


L-periaxin 


120 


30 


1033 


Z81317 


Schizosaccharomyces
pombe


DNA2-NAM7 helicase family 


685 


31 


1034 


Y41519 


Homo sapiens


Fragment of human secreted
protein encoded by gene 75.


1321 


99 


1035 


AJ276004 


Mus musculus 


Paxneb protein 


1709 


77 


1036 


AF025459 


Caenorhabditis
elegans


HI 4AT 0 ~K crpn^ nrnrtunt* 


l on 


J U 


1037 


U37251 


Homo sapiens 


Description: KRAB zinc finger 
protein; this is a splicing 
supplied by author 


196 


43 


1038 


W74580 


Homo 
sapiens 


Human membrane protein 
BA0306.


1921 


97 


1039 


U88173 


Caenorhabditis
elegans


weak similarity to 
Arabidopsis thaliana 
ubiquitin-like protein 8 


331 


80


PCT/US00/34263 





1040 


AF290204 


Homo sapiens 


blood group carrier molecule 
DOK1


1637 


99 


1041 


Y96730 


Homo 
sapiens 


PRO539, a Costal-2 homologue.


162 


22 


1042 


AF140683 


Mus musculus 


F-box protein FWD2 


2397 


98 


1043 


AF151023 


Homo sapiens 


HSPC189 


1104 


100 


1044 


AF181631 


Drosophila 
melanogaster 


BCDNA.GH04929 


204 


37 


1045 


Y77985 


Homo sapiens 


Human collectin amino acid 
sequence . 


1940


100 


1046 


AJ243972 


Homo sapiens 


6-phosphogluconolactonase




100


1047 


AB035863 


Homo sapiens 


ATP specific succinyl CoA 
synthetase beta subunit 
precursor 


2324 




1048 


AL034550 


Homo sapiens 


dJ1184F4.2 (novel protein
similar to nucleolar protein 
4 (NOL4) (NOLP)) 


981 


92 


1049 


AF163825 


Homo sapiens 


pre-B lymphocyte protein 3 


634 


100 


1050 


AF201949 


Homo sapiens 


60S ribosomal protein L30 
isolog 


868 


100 


1051 


AF190624 


Mus musculus 


mdg1-1


236 


85 


1052


AE003529 


Drosophila 
melanogaster 


CG6151 gene product 


160 


44 


1053 


G01191


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5272.


646 


98 


1054 


AL162756 


Neisseria 
meningitidis 


Glu-tRNA(Gln) 

amidotransferase subunit A


682 


44 


1055 


AF181856 


Rattus 
norvegicus 


tRNA selenocysteine
associated protein 




99 


1056 


U89649 


Chlamydomonas
reinhardtii


1 1 x / \j \j u uulci al ill u.y He XI l 

light chain 


244 


34 


1057 


AF159141 


Homo sapiens 


breast cancer metastasis- 
suppressor 1 


663 


53 


1058 


AF230929 


Homo 
sapiens 


keratinocyte annexin-like 
protein pemphaxin 


1710 


99 


1059 


AJ270952 


Homo sapiens 


putative membrane protein 


1363 


100 


1060 


AF224263 


Heterodontus 
francisci


HoxD8 


742 


83 


1061 


X63417 


Homo sapiens 


IRLB 


1037 


100 


1062 


AL079345 


Streptomyces 
coelicolor 
A3(2)


hypothetical protein 


143 


27 


1063 


Y71112 


Homo sapiens 


Human Hydrolase protein-10
(HYDRL-10).


2547 




1064 


AF263614


Homo sapiens 


acetyl-CoA synthetase


3493




1065 


Y13356 


Homo sapiens 


Amino acid sequence of 
protein PRO221.


1363 


100 


1066 


AC006153 


Homo sapiens 


similar to Aquifex aeolicus 
GTP-binding protein; similar 
to AE000771 (PID:g2984292) 


662 


98 


1067 


Y18930 


Sulfolobus 
solfataricus


hypothetical protein 


162 


29 


1068 


R65969 


Homo
sapiens


T98G Glioblastoma-derived
polypeptide


887 


100 


1069 


Y07964 


Homo sapiens 


Human secreted protein
fragment


863 


96 


1070 


AF177476 


Rattus 
norvegicus 


CDK5 activator-binding 
protein 


1995 


86 


1071 


AF245505 


Homo sapiens 


adlican 


3109 


99 


1072 


U92794 


Mus musculus 


alpha glucosidase II, beta 
subunit 


147 


36 


1073 


G03889 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7970. 


698 


98 


1074 


U15779 


Homo sapiens 


p70 


380 


28 


1075 


Y13392 


Homo sapiens 


Amino acid sequence of
protein PRO328.


1271


91


1076 


AF161457 


Homo sapiens 


HSPC339 


571 


100 


1077 


Y79509 


Homo sapiens 


Human carbohydrate-associated 
protein CRBAP-5. 


2151 


98 


1078 


AF223466 


Homo sapiens 


HT015 protein 


831 


66


1079 


AL132965 


Arabidopsis 
thaliana 


putative WD-40 repeat-protein 


286 


29 


1080 


AB024937 


Homo sapiens 


LUNX 


1284 


100 


1081 


Y14768 


Homo sapiens 


V-ATPase G-subunit like 
protein 


579 


100 


1082 


AF016416 


Caenorhabditis
elegans


F29A7.4 gene product 


141 


31


1083 


L13291 


Homo sapiens 


ADP-ribosylarginine hydrolase 


802 




1084 


AB041541 


Mus musculus


unnamed protein product 


151 


44 


1085 


G01922 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 6003.


ZU^ 


97 


1086 


AB030814 


Homo sapiens 


H-REV107 protein homolog 


OJJ 


100


1087 


AF151638 


Homo sapiens 


phosphatidylcholine transfer
protein 


11/11 

XXHZ. 


100 


1088 


Y84432 




Amino acid sequence of a
human RNA-associated
protein.


A / OJ 


100 


1089 


Y94867 


Homo 
sapiens 


Human protein clone HP10563.


613 


100 


1090 


AK023982 


Homo sapiens 


unnamed protein product


130 


49 


1091 


AB041586 


Mus musculus 


unnamed protein product 


1103 


81 


1092 


Y71277 


Homo sapiens 


Human Zlipo3 protein. 


606 


100 


1093 


1134973 '"■ 


Mus musculus


protein tyrosine phosphatase- 
like 


1131 


95 


1094 


Y66677 


Homo 


Membrane-bound protein 

ponno ft 


522 


56 


1095 


Y87276 


Homo sapiens 


Human signal peptide
containing protein HSPP-53
SEQ ID NO: 53.


1029 


99 


1096 


Y87276 


Homo sapiens 


Human signal peptide 
containing protein HSPP-53 
SEQ ID NO: 53. 


863 


no 
JO 


1097 


AF161455 


Homo sapiens 


HSPC337


742 


OQ 
J o 


1098 


U80029 


Caenorhabditis
elegans


similar to thioredoxin 


242 


39 


1099 


AJ005866


Homo sapiens 


Sqv-7-like protein 




99 


1100 


AJ005866 


Homo sapiens 


Sqv-7-like protein


1118 


99 


1101 


AJ005866 


Homo sapiens 


Sqv-7-like protein 


891 


99 


1102 


AJ005866 


Homo sapiens 


Sqv-7-like protein


1016 


99 


1103 


AL110244 


Homo sapiens 




299 


31 


1104 


AF242194 


Drosophila 
melanogaster 


brakeless-B 


147 


52 


1105 


AL031010 


Homo sapiens 


dJ422F24.1 (PUTATIVE novel
protein similar to C. elegans 
C02C2.5) 


968 


100 


1106 


U28016 


Mus musculus 


(phosphodiesterase)-related
protein


1624 


87 


1107 


AJ278150 


Homo sapiens 


putative lipid kinase


2207 


99 


1108 


G03733 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 7814.


495 


98 


1109 


AF217287 


Drosophila 
melanogaster 


G protein RhoBTB 


834 


54 


1110 


Y28921 


Homo 
sapiens 


Human regulatory protein 
HRGP-7. 


941 


48 


1111 


Y28921 


Homo 
sapiens 


Human regulatory protein 
HRGP-7. 


1331 


51 


1112 


AF176704 


Homo sapiens 


F-box protein FBX9 


2027 


99 


1113 


AF182076 


Homo 
sapiens 


glioma tumor suppressor 
candidate region protein 2 


2418 


100 


1114 


G04039 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 8120.


475


96


1115 


AF229439 


Mus musculus


zinc finger protein 289 


1697 


91 


1116 


L40357 


Homo sapiens 


thyroid receptor interactor


D V? 


100


1117 


L40357 


Homo sapiens 


thyroid receptor interactor 


404 


85 


1118 


A12155 




Unman Y^T. r>rVMZi 


1673 


100


1119 


AL161542 


Arabidopsis
thaliana


isomerase like protein 


607 


53 


1120 


AL023754 


Homo sapiens


UU<1 f . 1 \ rtd u 

v-ci*s t / uiimoaui in uepenuenc 
Protein Kinase LIKR oroteinl 


2341 


98 


1121 


Y57901 


Homo sapiens 


Human transmembrane protein
HTMPN-25.


JZl 


36 


1122 


Z14122 


Xenopus 
laevis 


XLCL2 




77 


1123 


AP225418 


Homo sapiens 


lipase 


1531 


97 


1124 


Y06518 


Homo sapiens


^eii uiFdse ljiceracciny 


3227


100 


1125 


AL035690 


Homo sapiens 


dJ202 121.1 (novel protein) 


952 


100 


1126 


AJ000217 


Homo sapiens




1286 


99 


1127 


AB030505 


Mus musculus 


UBE-1C2 


1069 


79 


1128 


Y73375 


Homo sapiens 


HTRM clone 1427838 protein
sequence.


874 


100 


1129 


Y78941 


Homo sapiens 


Cyclophilin-type peptidyl
prolyl cis/trans isomerase
amino acid sequence.


877 


100 


1130 


AL023553 


Homo sapiens 


dJ347H13.4 (novel protein) 


557 


100




VQ1 QAC. 


Homo sapiens 


Human chaperone protein 6
(HCHP-6).


1408 


100 


1132 




Schizosaccharomyces
pombe


putative nuclear pore protein 


596 


39 


1133 


Z*8197 


Schizosaccharomyces
pombe


putative nuclear pore protein 


389 


35 


1134 


AF180681 


Homo sapiens


guanine nucleotide exchange 
factor 


3597 


100 


1135 


AF079765 




enhancer of polycomb 


2o4 


41 


1136 


M62419 


Mus musculus 


clathrin-associated protein 


2189 


99 


1137 


AJ006219 


Drosophila
melanogaster


clathrin-associated protein 


1254 


78 


1138


Y76218 


Homo sapiens 


Human secreted protein 
encoded by gene 95.


440 


98 


1139 


W8RT 04 

n OCX 


Homo 
sapiens 


A Rab protein designated 
HRABS-2 . 


1065 


99 


1140 


Y13401


Homo sapiens 


Amino acid sequence of 
protein PRO339.


3979 


98 


1141 


no 


Chimeric
Homo sapiens 


Green fluorescent protein- 
Zap70 fusion product. 


3309 


100 


1142 


Y13402 




Amino acid sequence of 
protein PRO310.


1694 


99 


1143 


G03875 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7956. 


660 


99 


1144 


Y12917 


Homo sapiens 


Amino acid sequence of a 
human secreted peptide. 


750 


98






Homo sapiens 


Amino acid sequence of a 
human secreted peptide. 


1096 


100 


1146 


AL022157 


Homo sapiens


SPIN (SPINDLIN HOMOLOG
(PROTEIN DXF34))


1233 


100 


1147 


AL022157 


Homo sapiens 


SPIN (SPINDLIN HOMOLOG
(PROTEIN DXF34))


1233 


100 


1148 


G02548 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6629. 


370 


98 


1149 


Y73338 


Homo sapiens 


HTRM clone 2019742 protein 
sequence . 


1492 


100 


1150 


W74841 


Homo sapiens 


Human secreted protein
encoded by gene 113 clone
HEAAR60.


228


55


1151 


AF044201


Rattus
norvegicus 


neural membrane protein 35; 
NMP35 


1570 


92 


1152 


AF156774


Homo 
sapiens 


lysophosphatidic acid 
acyltransferase-gamma 1


1855 


99 


1153


AL118501


Homo sapiens 


dJ1191N16.1 (A novel protein
(translation of the cDNA
DKFZp566A0946, Em:AL050069))


872 


64 


1154 


AF131852 


Homo sapiens 


Unknown 


473 


100 


1155 


Y41705 


Homo 
sapiens 


Human PRO352 protein
sequence. 


1381 


97 


1156 


G04036 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8117.


607 


99 


1157 


AF112444 


Lupinus 
luteus 


L-asparaginase 


287 


43 


1158 


AF151848 


Homo sapiens 


CGI-90 protein 


232 


32 


1159 


AJ272267


Homo sapiens 


choline dehydrogenase 


2449 


100 


1160 


AB001773 


Ciona 
savignyi 


PEM-6 


196 


33 


1161


Y87330 


Homo sapiens 


Human signal peptide 
containing protein HSPP-107 
SEQ ID NO: 107. 


746 


83 


1162 


Y87330 


Homo sapiens 


Human signal peptide 
containing protein HSPP-107 
SEQ ID NO: 107.


746 


83 


1163


AF113534 


Homo sapiens 


HP1-BP74 protein 


2723 


96 


1164 


AF232226 


Danio rerio 


Deddl 


191 


41 


1165 


AL118501 


Homo sapiens 


dJ1191N16.1 (A novel protein 
(translation of the cDNA 
DKFZp566A0946, Em:AL050069))


1051 


71 


1166 


AL118501 


Homo sapiens 


dJ1191N16.1 (A novel protein 
(translation of the cDNA 
DKFZp566A0946, Em:AL050069))


945 


76 


1167 


AF187733 


Homo sapiens 


syntaphilin 


831 


42 


1168 


AB019435 


Homo sapiens 


phospholipase 


951 


55 


1169 


AF064604 


Homo sapiens 


KE03 protein 


324 


33 


1170 


Y01164 


Homo sapiens 


Polypeptide fragment encoded 
by gene 6.


1191 


100 


1171 


1)0318 8 


Saccharomyces
cerevisiae


putative 


180 


22 


1172


ft T71 T 1 T C T 

Aril J fox 


Mus musculus 


nuclear pore membrane 
glycoprotein POM210 


3941 


81 


1173


AO 2 4 b 4 1 / 


Homo sapiens 


G5b protein 


794 


100 


1174 


AL022238 


Homo sapiens 


dJ1042K10.3 (novel protein) 


1285 


100 


1175 


U41278 


Caenorhabditis
elegans


F33G12.3 gene product 


332 


28 


1176 


M35617 


Homo sapiens 


T-cell receptor V-alpha-J- 
alpha region 


284 


83 


1177 


AC012680 


Arabidopsis 
thaliana 


putative protein phosphatase 
2C; 55455-56414 


209 


37 


1178 


G01345 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 5426. 


692 


99 


1179 


AL096767 


Homo sapiens 


dJ579N16.3 (novel protein
similar to worm, Arabidopsis 
and pine proteins) 


1342 


100 


1180 


AF039716 


Caenorhabditis
elegans


similar to ATP synthase B 
chain 


496 


55 


1181 


Y11710 


Homo sapiens 


collagen type XIV 


1048 


97 


1182 


X82240 


Homo
sapiens]
>R94974
R94974 09-
MAY-1996 27-
OCT-1994
Human TCL-1
polypeptide.
[Homo
sapiens


T cell leukemia/lymphoma 1


617


100


1183 


U42841 


Caenorhabditis
elegans


short region of weak 
similarity to collagen 


161 


33 


1185 


MU X O JL D ± o 


Homo sapiens 


dicarboxylate carrier protein 


1470 


99 


1186 


L27645 


Danio rerio 


growth-associated protein 


130 


36 


1187 


Y02738 


Homo sapiens 


Human secreted protein 
encoded by gene 89 clone 

UT TJT7DH 1 

ru_irlr ruj . 


636 


100 


1188 


AF217544 


Xenopus 
laevis 


ornithine decarboxylase-2


1459 


60 


1189 


AL136307 


Homo sapiens 


dJ380B8.2 (Neuritin, a 
protein which promotes 
neurite outgrowth) 


182 


33 


1190


AO "uUZ 


Homo sapiens 


rTSbeta 


197 


100 


1191 


U32828 


Haemophilus
influenzae
Rd


ribosomal protein S6 
modification protein (rimK) 


268 


31 


1192 


AF154831


Rattus 
norvegicus 


PV-1 


1403 


60 


1193 


Y50924 


Homo sapiens 


Human fetal brain cDNA clone 
vcl6__l derived protein. 


918 


100 




At* 02 653 0 


Rattus 
norvegicus 


stathmin-like-protein splice 
variant RB3


1093 


97 


1195 


U35244 


Rattus 
norvegicus 


vacuolar protein sorting 
homolog r-vps33a


2981 


96 


1196


Y7047Q 


Homo sapiens 


Human p53 target molecule, 
PRG3 protein.


1680 


100 


1197 


AF157318 


Homo sapiens 


AD-017 protein 


912 


47 


1198 


AF125443 


Caenorhabditis
elegans


contains similarity to S. 
pombe phosphatidyl synthase 

1GB : Z2 82 95 / 


460 


39 


1199




Homo sapiens 


DC12 


1649 


88 


1200 


fiT.ft - * 1 77c; 


Homo sapiens 


dJ30M3.3 (novel protein 
similar to C. elegans 
Y63D3A.4) 


1902 


100 






Ovis aries 


BIIIB4 high-sulfur keratin


484 


82 


1202 


Z85986 


Homo sapiens 


dJ108K11.3 (similar to yeast 
suppressor protein SRP40) 


1143 


75 


1203 


U18762 


Rattus 
norvegicus 


retinol dehydrogenase type I 


890 


52 


1204 


U35730 


Mus musculus 


jerky


2235 


76 


1205 


AB002327 


Homo sapiens 


KIAA0329


151 


24 


1206 


AB019233 


Arabidopsis 
thaliana 


ubiquinone/menaquinone
biosynthesis
methyltransferase-like


762 


56 


1207


AL136307


Homo sapiens 


dJ380B8.2 (Neuritin, a 
protein which promotes 
neurite outgrowth) 


742 


100 


1208


Ar^O /989 


Homo sapiens 


orphan G-protein coupled 
receptor 


2326 


100 


1209


•7Q"7<n ft 

ZiS /dj 0 


Homo sapiens 


dJ466N1.4 (novel protein 
similar to ANK3 (ankyrin 3, 
node of Ranvier (ankyrin 
G) ) ) 


181 


44 


1210 


U21549 


Mus musculus 


Ac39/physophilin 


1280 


68 


1211 


Y27700 


Homo sapiens 


Human secreted protein 
encoded by gene No. 12. 


1267 


100 


1212 


AF117814 


Mus musculus 


odd-skipped related 1 protein 


945 


6" 6" 


1213 


AF277233 


Naegleria 
fowleri 


calcineurin B 


222 


39 


1214 


D14849 


Mus musculus 


meiosis-specific nuclear
structural protein 1 


1950 


77 


1215 


G03022 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7103. 


590 


100 


1216 


Z72510 


Caenorhabditis
elegans


similarity to yeast UTR3
protein (Swiss Prot accession
yk677hll.5 comes from this
gene


634


49


1217 


249703 


Saccharomyces
cerevisiae


unknown 


134 


22 


1218 


AC013430 


Arabidopsis 
thaliana 


F3F9.18 


199 


29 


1219 


L10910 


Homo sapiens 


splicing factor 


1026 


71 


1220 


Z70750 


Caenorhabditis
elegans


similar to Vanadate

resistance protein 

trarn3ttipmhrannn«! rompe -F-rom 
this gene 


965


58 


1221 


AL163815 


Arabidopsis 
thaliana 


putative protein


C C "3 
D9J 


61 


1222 


AF155100 


Homo sapiens 


zinc finger protein ny-rfm-91
antigen 


2261 




1223 


J05071 


Bos taurus 


GTP-binding regulatory 
protein gamma-6 subunit


ODD 


inn 


1224 


Y73364 


Homo sapiens 


HTRM clone 2765991 protein
sequence . 


1169 


99 


1225 


AL050170 


Homo sapiens 


hypothetical protein 


714 


100 


1226


XG4002 


Homo sapiens 


RAP74 


2661 


99 


1227 


X04085 


Homo sapiens 


catalase


2846 


100 


1228 


AJ005620 


Mus musculus 


skeletal muscle-specific gene 


1416 


90 


1229 


AF045564 


Rattus 
norvegicus 


development-related protein


1715 


93 


1230 


X97571 


Mus musculus 




4 /y 


96 


1231 


L08239 


Homo sapiens 


located at OATL1


2274 


100 


1232 


AF121863 


Homo sapiens 


sorting nexin 14 


1964 


100 


1233 


API ?i fifil 

" C JL £. JL © V J 


Homo sapiens 


sorting nexin 14 


1203 


84 


1234 


AC0248 OS 


Caenorhabditis
elegans


contains similarity to 


744 


31 


1235 


AC006634 


Caenorhabditis
elegans


Saccharomyces cerevisiae 
YLR418C (GB:U20162) 


357 


33 


1236 


Y18101 


Mus musculus 


tyrosine-phosphorylated 
protein 


1559 


87 


1237 


AB042646 


Homo sapiens 


TGIF2 


1224 


100 


1238 


AB026264 


Homo sapiens 


IMPACT


1694 


100 


1239 


AB026264 


Homo sapiens 


IMPACT 


1123 


100 


1240 


G00429 


Homo sapiens


Human secreted protein, SEQ
ID NO: 4510.


324 


100


1241 


Y76144 


Homo sapiens 


encoded by gene 21. 


Ufa j 


53 


1242 


AL035602 


Arabidopsis 
thaliana 


putative protein 


499


O Q 
£. O 


1243 


X76483 


Gallus 
gallus 


Yes-associated protein 
(65kDa) 


574 


48 


1244 


AF220186 


Homo sapiens 


uncharacterized hypothalamus
protein HT012 


503 


100 


1245 


AL021453 


Homo sapiens 


UUO^lUJLl. J IrUlMll vil prOCclIl/ 


Br/! 

656 


100 


1246 


AJ276003 


Homo sapiens 


GAR1 protein 


1216 


100 


1247 


Y57910 


Homo sapiens


Human transmembrane protein 
HTMPN-34. 


1369 


98 


1248 


AC004874 


Homo sapiens 


similar to
acetylgalactosaminyltransferase;
similar to Q07537
(PID:g1171989)


OCT 
53/ 


100


1249 


AF199597 


Homo 
sapiens 


A-type potassium channel
modulatory protein 1 


1139 


100 


1250 


Y13148 


Rattus 
norvegicus 


PAG608 


1350 


88 


1251 


M24852 


Rattus 
norvegicus 


neuron-specific protein
PEP-19


124 


46


1252 


AF146738 


Rattus 
norvegicus 


testis specific protein 


771 


83 


1253 


G02725 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6806. 


413 


97 


1254 


W44375 


Homo sapiens 


Human ubiquitin-conjugating
enzyme polypeptide. 


1045 


99 


1255 


AC006538 


Homo sapiens 


BC41195_1 


831 


78 


1256 


AB004316 


Bos taurus 


mitochondrial methionyl-tRNA
transformylase




bp" 


1257 


Z35094 


Homo sapiens 


SURF-2 


1354 


97 


1258 


Y13362 


Homo sapiens 


Amino acid sequence of 
protein PRO214.


'JO J 


xUU 


1259 


Adb0£oi4 


Homo sapiens 


similar to RFP transforming
protein; similar to P14373
(PID:g132517)




100


1260 


AC005099 


Homo sapiens 


match to AI222572 
(NID:g3804775) 


469 


100 


1261 


V00507 


Homo sapiens 


coding sequence of DHFR (1 is 
1st base in codon) (561 is 
3rd base in codon) 


jot 




1262 


X15443 


Rattus sp. 


gamma-glutamyl transpeptidase
(AA 1-568) 


697 


32 


1263 


AF173871 


Mus musculus 


neuronal PAS3 






1264 


AP178983 


Homo sapiens 


Ras-associated protein Rap1


433 




1265 


Y70473 


Homo sapiens 


Human cyclic nucleotide-associated
protein-1 (CNAP-1).


2785 


99 


1266 


Y41738 


Homo 
sapiens 


Human PRO541 protein
sequence.




100


1267 


AF061346 


Mus musculus 


Edpl protein 


1077 


64 


1268 


U97006 


Caenorhabditis
elegans


C13F10.4 gene product


154 


23 


1269 


AF233582 


Mus musculus 


GTPase Rab37 


$42 


95 


1270 


AF195951 


Homo sapiens 


signal recognition particle
68 


J X^S / 




1271 


AL031177 


Homo sapiens 


dJ889M15.3 (novel protein) 


1150 




1272 


AF201933 


Homo sapiens 


DC11 


650 


100 


1273 


AF201933 


Homo sapiens 


DC11 


346 


98


1274


AL021710


Arabidopsis 
thaliana 


putative protein 


348 


A O 


1275 


AC004449 


Homo sapiens 


R33683_3 


556


100 


1276 


Y86295 


Homo sapiens 


Human secreted protein 
HL2AGB7, SEQ ID NO:210.


1920 


100


1277 


Y71111 


Homo sapiens 


Human Hydrolase protein-9
(HYDRL-9).


19 / © 


Q Q 


1278 


S94421 


Homo sapiens 


T cell receptor eta-exon 


47B 


100 


1279 


Y66695 


Homo 
sapiens 


Membrane-bound protein
PRO1344.


1909 


100


1280


AF161380


Homo sapiens 


HSPC262 


772 


100 


1281 


Y48610 


Homo sapiens 


Human breast tumour-
associated protein 71. 


779 


100


1282 


AC015446 


Arabidopsis 
thaliana 


Similar to AIG1 protein


*i U 0 


T C 

J b 


1283 


AK024432 


Homo sapiens 


FLJ00059 protein


U.3 


35 


1284 


W96153 


Homo sapiens 


Human FADD-interacting

piULcin \CXfJ . 


1825 


81 


1285 


AJ001019


Homo sapiens 


ring finger protein 


1301 


100 


1286 


AE003823 


Drosophila 
melanogaster 


CG13178 gene product 


195 


29 


1287 


AF170632 


Homo sapiens 


FEM-1-like death receptor
binding protein 


3261


100 


1288 


AC006033 


Homo 
sapiens 


similar to MLN 64; similar to 
138027 (PID:g2135214) 


1195 


100 


1289 


AC006033


Homo 
sapiens 


similar to MLN 64; similar to
138027 (PID:g2135214)


668 


93 


1290 


AB023811 


Homo sapiens 


TU3A 


351 


54


1291 


Z73424 


Caenorhabditis
elegans


L^lfJDJ . 1 


235 


36 


1292 


Y94871 


Homo 
sapiens 


Human protein clone Ht , 0i£b5l.


1222 


100 


1293 


AF180425




retinoblastoma-associated
protein RAP140 


489 


29 


1294 


G03856 


Homo sapiens 


Human secreted protein, SEQ
ID NO: 7937.


538 


99 


1295 


AF133670 


Mus musculus


- o inueracc xny prOC6in-z 


367 


51 


1296 


AJ249735 


Homo sapiens 


claudin-6 


1142 


100 


1297 


X57560 


Escherichia
coli


pspE protein 


535 


100 


1298 


AF169284 


Homo sapiens 


LIM and cysteine -rich domains 
protein 1 


1997 


100 


1299 


U41023 


Caenorhabditis
elegans


coded for by C. elegans cDNA 
yKo in. j; coaea tor by C - 

wH n qv. q c 
yKX \J3iia . d 


324 


29 


1300


AB024523 


Homo sapiens 


basic kruppel like factor 


1206 


100 


1301 


X55989 




eosinophil cationic-related 
protein 


737 


99 


1302 


AF0071^1 


Homo sapiens


unknown 


1481


100 


1303 


X52904 


Escherichia
coli


open reading frame (AA 1-65)


359 


100 


1304


U19577 


Escherichia
coli


galactonate dehydratase 


242 


93 


1305 


AF266508 


Mus musculus 


NELF protein 


1409 


97 


1306 


Y57901 


Homo sapiens 


Human transmembrane protein 
HTMPN-25.


932 


100


1307


US 8 750 


Caenorhabditis
elegans


similar to the mitochondrial 
carrier family 


365 


54 


1308


nc u ^± t i i *x 


Homo sapiens 


breakpoint cluster region 
protein 2 


2681 


99 


1309 


AL078593 


Homo sapiens 


dJ210Bl.l (KIAA0680) 


267 


34 


1310 




Homo sapiens 


E48 antigen 


620 


96 


1311 


Z82263 


Caenorhabditis
elegans


C47A4.1 


283 


35 


1312 


AF131218 


Homo sapiens 


chromosome 16 open reading 
frame 5 


1493 


100 


1313 


Y41763 


Homo 


Human PRO938 protein
sequence.


1636 


100 


1314 


AF196972 


Homo sapiens 


JM24 protein 


2239 


100 


1315




Homo sapiens 


insulin receptor substrate 
like protein 


228 


97 


1316 


Y66695 


Homo 
sapiens 


Membrane-bound protein
PRO1344.


1909


100 


1317 


AF153127 


Gallus 
gallus 


SAPK interacting protein 


2442 


89 


1318 


AF153127 


Gallus 
gallus 


SAPK interacting protein 


1477 


83 


1319 


AF153127 


Gallus 
gallus 


SAPK interacting protein 


1651 


86 


1320 


X56932 


Homo sapiens 


23 kD highly basic protein 


1044 


100 


1321 


AF174605 


Homo
sapiens]
>Y83086
Y83086 09-
AUG-1998 F-
box protein
FBP-1B.
[Homo
sapiens


F-box protein Fbx25 


4*7 


70 


1322 


M61732 


Trypanosoma 
cruzi 


neuraminidase 


214 


24 


1323


Y17013


porcine
endogenous
retrovirus


pol


304


64


1324 


*»-LJ J* «J Q W ^ mJ 


Arabidopsis
thaliana


putative protein 


1174 


37 


1325 


AL138655


Arabidopsis
thaliana


putative protein 


946 


35 




AL133215 


Homo sapiens 


DA108L7.2 (novel protein
similar to rat tricarboxylate
carrier)


1322 


99


1327 


AF161541


Homo sapiens




1357 


99 


1328 


Y73346 


Homo sapiens


niKM cione oiyoyy protein 
sequence. 


785 


96 


1329 


L10910 


Homo sapiens 


splicing factor 


912 


82 


1330 


API 4£t;£ft 


Homo sapiens 


mili protein 


1936 


100 


1331 


W87772 


Homo sapiens 


Human serum glucocorticoid- 
regulated kinase (H-SGK2) 
polypeptide . 


232 


39 


1332 


V4 1 Til 


Homo 
sapiens 


Human PRO704 protein 
sequence . 


1860 


100 


1333 


AF295096 


Homo sapiens 


zinc-finger protein ZBRK1


411 


91 


1334


Zo22 /l 


Caenorhabditis
elegans


Similarity to Mouse kinesin-
like protein KIF4 comes from 
this gene 


578 


44 


1335


Ab U U U a 1 U 


Methanobacterium
thermoautotrophicum


conserved protein 


290 


43 


1336 


Y68779 


Homo sapiens 


Amino acid sequence of a 
human phosphorylation 
errector fHbP-ii. 


1019 


91 


1337 


AB027003 


Mus musculus 


protein phosphatase 


378 


B4 


1338


ri£ A DEC 


Caenorhabditis
elegans


weak similarity to TPR 
domains 


215 


40 


1339 


AE001394 


Plasmodium 
falciparum 


protein of the YMR7 family 


170 


29 


1340 


X76717 


Homo sapiens 


MT-11 protein 


204 


89 


1341


AC011914 


Arabidopsis 
thaliana 


putative mutT protein; 68398- 
678B1 


289 


45 


1342 


AJ276171 


Homo sapiens 


ASPIC 


2122 


100


1343


Arib /Ulb 


Homo sapiens 


myosin regulatory light chain 
interacting protein MIR 


2303 


99 


1344


Bonn coil 
At- U u b y o j 


Homo sapiens 


similar to Kelch proteins; 
similar to BAA77027 
(PID:g4650844) 


894 


35 


1345 


AF257466


Homo sapiens 


N-acetylneuraminic acid 
phosphate synthase 


1880 


99 


1346 


Y25896


Homo sapiens 


Human secreted protein 
fragment encoded from gene 
64. 


1148 


100 


1347 


0 71 


Torpedo
marmorata


male sterility protein 2-like 
protein 


1664 


58 


1348 


AF16154 ft 


Homo sapiens 


HSPC063 


1018 


98


1349 


W78128 


Homo sapiens 


Human secreted protein 
encoded by gene 3 clone 

nUi>BJ.9b . 


1117 


100 


1351 


G02144 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6225. 


418 


100 


1352 


D90869 


Escherichia 
coli 


similar to 


■? fid. 7 


100 


1353 


A12029 


Homo sapiens 


MRP-14


613 


100 


1354 


AC005328 


Homo sapiens 


R26660_1, partial CDS


870 


74 


1355 


AC024876 


Caenorhabditis
elegans


contains similarity to 
SW:RPB1_CRIGR 


829 


61 


1356 


AF077226 


Homo sapiens 


copine III 


1876 


£4 


1359 


AF217188 


Mus musculus 


YIP1B 


801 


63 


1360 


AC074331 


Homo sapiens 


ZNF234


3869 


100 


1361 


AL163279 


Homo sapiens 


homolog to cAMP response
element binding and beta
transducin family proteins


5035


99


TABLE 2


SEQ
ID
NO:


ACCESSION
NUMBER


SPECIES


DESCRIPTION


SMITH-
WATERMAN
SCORE


%
IDENTITY


1362


Z48475 


Homo sapiens 


glucokinase regulator 


3160 


99 


1363 


Z48475 


Homo sapiens 


glucokinase regulator 


2682 


97 


1364


AF195764 


Homo sapiens 


megakaryocyte-enhanced gene 
transcript 1 protein; MEGT1 
protein 


2055 


99 


1365


At 11001/3 


Homo sapiens 


PRO0915


581 


100 


1366 


AF116609 


Homo sapiens 


PRO0915 


581 


100 


1367 


AL117352 


Homo sapiens 


dJ876B10.3 (novel protein 
similar to C. elegans 
T19B10.6 (Tr:Q22557) ) 


2581 


99 


1368


Y34124 


Homo 
sapiens 


Human potassium channel 
K+Hnov15.


1342 


100 


1369 


AJ245621 


Homo sapiens 


CTL2 protein 


3728 


99 


1370


AF008220


Bacillus 
subtilis 


YtaG 


429 


45 


1371 


X05562 


Homo sapiens 


alpha-2 chain precursor (AA -25
to 1018) (3416 is 2nd base
in codon)


5908 


99 


1372


7 Q Q ft/1 Q 


Homo sapiens 


dJ408N23.4 (novel DnaJ domain 
protein) 


1296 


99


1373




Homo sapiens 


FLASH 


10253 


100 


1374 


U20286


Rattus 
norvegicus 


lamina associated polypeptide 
1C 


1567 


69 


1375


U53445 


Homo sapiens 


boci 


1645 


46 


1376


ALII /33 7 


Homo 
sapiens 


bA393J16.1 (zinc finger 
protein 33a (KOX 31))


250 


60 


1377


AC005328


Homo sapiens 


R26660_1, partial CDS


1126 


100 


1378 


U35113


Homo sapiens 


metastasis-associated gene 


1823 


69 


1379


L15313


Caenorhabditis
elegans


putative 


858 


58 


1380


X ZD / jo 


Homo sapiens 


Human secreted protein 
encoded from gene 46 . 


1508 


100 


1381 


AB037360 


Homo sapiens 


ANKHZN 


5734 


95 


1382 


AB037360 


Homo sapiens 


ANKHZN 


959 


97 


1383


AF237676


Mus musculus 


G beta-like protein GBL 


1721 


96 


1384


AF237676 


Mus musculus 


G beta-like protein GBL 


1043 


70 


1385


Y58793 


Homo sapiens 


Human calcium regulatory 
protein CaREG-1 . 


715 


100 


1386 


AF212162 


Homo sapiens 


ninein 


10369 


99 


1387


AL031685 


Homo sapiens 


dJ963K23.2 (novel protein) 


337 


33 


1388 


AC004890 


Homo sapiens 


similar to zinc finger 
proteins; similar to BAA24380 
>W06316 W06316 03-OCT-1996 
27-APR-1995 TRP-1 protein. 


542 


86


1389


A I? 1 Q7QQQ 

Mia / yuy 


Homo sapiens 


zinc finger protein ZNF223 


2665 


99 


1390


AC035150 


Homo sapiens 


Zinc finger protein ZNF221 


3459 


100 


1391


AF287S94 


Homo sapiens 


PIST 


1410 


97 


1392


Ar zozzdd 


Homo sapiens 


inner centromere protein 
INCENP 


1794 


99 


1393


X90840 


Homo sapiens 


axonal transporter of
synaptic vesicles


4584 


99 


1394 


AF076249 


Homo sapiens 


zinc finger protein SBBIZ1 


3208 


99 


1395 


G02224 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6305. 


299 


75 


1396




Arabidopsis
thaliana 


Similar to 


130


34 


1398 


AF242519 


Homo sapiens 


zinc finger protein SBZF3 


181 


66 


1399 


AL133396 


Homo 
sapiens 


dJ1068H6.4 (prion protein 
like protein doppel) 


962 


100 


1400 


Y48611 


Homo sapiens 


Human breast tumour- 
associated protein 72. 


817 


99 


1401 


AC004472


Homo sapiens 


PI. 11659 5 


280 


54 


1402 


X91489 


Saccharomyces
cerevisiae


putative HMG box 


164 


27 


1403 


Y79222


Homo sapiens


Human (- >»n n o P a. r*?» a r* TUMQPQ-1 d 


O R A O 


100


1404 


X81058


Mus musculus 


tex261 


1010 


99 


1405 


ARfil onR A 

nDU ItuOl 


Mus musculus


ITM 


1 QA 


0 Q 


1406 


AB030251 


Homo sapiens 


GTPase activating protein 


3233 


99 


1407 




rattus 


riu-iij\c protein 


O C Q A 


Q Q 


1408


A/3 /OU 


Drosophila
melanogaster


T DD A 1 


1 CA 

3 o4 


29 


1409


U / bblo 


Mus musculus 


"M DAD 


804 


48


1410




Homo sapiens 


r-iOoo / 1/ partial v^Jjo 


835


63 


1411 


AE000284 


Escherichia
coli


orf, hypothetical protein. 


360 


100 


1412


vm c^o 

AUlbb 6 


Escherichia 
coli 


L5 (rplE) (aa 1-179)


911 


100


1413


W78279 


Homo sapiens 


Fragment of human secreted 
protein encoded by gene 33. 


1264 


99 


1414 


AdU3 lUbl 


Homo sapiens 


organic anion transporter 
OATP-E 


3832 


100 


1415 


M17466 


Homo sapiens 


coagulation factor XII 


3455 


100 


1416 


AF097994 


Homo 
sapiens 


L-kynurenine/alpha-
aminoadipate aminotransferase


2202 


99 


1417 


AF151077 


Homo sapiens 


HSPC243 


1262 


99 


1418 


Y09945 


Rattus 
norvegicus 


putative integral membrane 
transport protein 


1098 


61 


1419 


U13152 


Mesocricetus 
auratus 


guanine nucleotide-binding 
protein beta 5 


2179 


76 


1420 


AL162458


Homo sapiens 


DA465L10.5 (KIAA1176 (novel 
protein, presumed ortholog 
of mouse K-Cl cotransporter 
KCC2) ) 


5696 


100 


1421 


Y99426 


Homo sapiens 


Human PRO1604 (UNQ785) amino 
acid sequence SEQ ID NO: 308. 


152 


29 


1422 


Y94923 


Homo sapiens 


Human secreted protein clone 
qsl4_3 protein sequence SEQ 
ID NO: 52. 


4039 


99 


1423 


AF177388 


Homo 
sapiens 


cancer-amplified 
transcriptional coactivator 


10748 


99 


1424 


Y48517 


Homo sapiens 


Human breast tumour- 
associated protein 62. 


1851 


99 


1425 


AF208848 


Homo sapiens 


BM-006 


1454


89 


1426


Av / Ubolo 


Homo sapiens 


BM-006


853 


79 


1427




Bos taurus 


differentiation enhancing 
factor 1 


4693 


95 


1428 


U41387 


Homo sapiens 


Gu protein 


1372 


63 


1429


Ai* lb lb J 4 


Homo sapiens 


HSPC049 


2853 


78 


1430 


AF125043 


Mus musculus 


bisphosphate 3'-nucleotidase


275 


30 


1431 


Y66718 


Homo 
sapiens 


Membrane-bound protein
PRO1106. 


1886 


100 


1432


Ait ly 3613 


Homo sapiens 


cell recognition molecule 
Caspr2 


568 


100 


1433 


AB044560 


Mus musculus 


Gliacolin 


192 


34 


1434 


R99800 


Homo sapiens 


NTII-l nerve protein, 
facilitates regeneration of 
nerve cells. 


707 


51 


1435 


AvoonR'?r) 


Homo sapiens


myo-inositol 1-phosphate
synthase A1


2904


100 


1436 


X70944 


Homo sapiens 


PTB-associated splicing 
factor 


1261 


72 


1437 


AF271732 


Homo sapiens 


bridging integrator-3 


1282 


100 


1438 


Y30811 


Homo sapiens 


Human secreted protein 
encoded from gene 1 . 


595 


98 


1439 


AJ293659 


Homo sapiens 


mucolipidin 


628 


97 


1440 


AF219138 


Homo sapiens 


GGA3 long isoform 


3083 


100 


1441 


AF219138 


Homo sapiens 


GGA3 long isoform 


3346 


100 



1442 




Homo sapiens


ALEX3 


1944 


100 


1443 


AF237711 


Drosophila
melanogaster 


Diablo 




o n 
z I 


1444 


AiTOll ft 


Homo sapiens


oetft protein 


439


39 


1445 


X73874 


Homo sapiens


phosphorylase kinase


6233 


98 


1446 






breast carcinoma-associated
antigen RPAR


3999


99 


1447 


AF003924 


Homo sapiens 


ANC_2H01


2645 


99 


1448 


AF003136 


Caenorhabditis
elegans


contains weak: similarity to 
an /n , if~uj.iiLiiiiy moult 


2843 


52 


1449 


AF155112 


Homo sapiens 


NY-REN-50 antigen 


1184 


89 


1450 


X jU U *i 


Homo sapiens 


Human secreted protein 

iinC/ 1 P!?A TO KI/"\ . >1 O 


985 


100 


1451 


AF107203 




ataxin 2-binding protein


688 


57 


1452 


AF107203 


Homo sapiens 


ataxin 2-binding protein 


456 


78 


1453 


23Q011 


Mus musculus




882 


56 


1454 


^ U 3 D O 


IJorflA C! ^ r> ^ ana 


Protein sequence and 
annotation available soon via 
LABEIT@EMBL-Heidelberg.DE 


510 


28 


1455 


/vuu jjIU? 


Homo sapiens


uuoo4riii.j isimnar to 
sialyltransferase)


1356


100 


1456 


D44480 


Mus musculus 


MATH-2 protein


272 


100 


1458 


/\r 1*11 J a b 


Homo sapiens 


RNA helicase HDB/DICE1


478 


45 


1459 


AF242552 


Gallus
gallus 


retinovin 


945 


34 


1460


U11036 


Homo sapiens 


Ibdl 


724 


84 


1461 


AB025258 


Mus musculus 


granuphilin-a 


545 


39 


1462 


Y08134


Homo sapiens 


acid sphingomyelinase-like
phosphodiesterase 


2428 


99 


1463 


AC004997


Homo sapiens 


match to ESTs 243979
(NID:g573097), R19699
(NID:g774333)


869 


98 


1464 


AC004997 


Homo sapiens 


match to ESTs 243979
(NID:g573097), R19699
(NID:g774333)


869 


98 


1465 


U32743 


Haemophilus
influenzae
Rd


fucose operon protein (fucU) 


315 


50 


1466 


X u J u z z 


Homo sapiens 


Not56-like protein


2342 


100 


1467


AC003034 


Homo sapiens 


Homolog of rat kidney- 
specific (KS) gene 


1072 


99 


1468


AF071544 


Spinacia
oleracea


ribulose-1,5-bisphosphate
carboxylase/oxygenase small
subunit N-methyltransferase I


333 


2^ 


1469




Homo sapiens 


Human transmembrane protein 
HTMPN-54 . 


1053 


100 


1470 


AF032666 


Rattus
norvegicus


rsec5 


4504 


93 


1471 


Y70467 


Homo sapiens 


Human membrane channel 
protein- 1/ {MttCHP-ivj . 


4*2 


74 


1472 


AL031033 


Homo sapiens 


C321D2.1 (Ribosomal Large 
Subunit Pseudouridine 
Synthase protein) 


1694 


100 


1473


AF177292 


Homo sapiens 


genethonin 3 


4026 


98 


1474


S45936


Homo sapiens 


HT&l 


1101 


50 


1475


Y86241 


Homo sapiens 


Human secreted protein 

JlU/voKt>U, 1U NQ:lb6. 


1879 


98 


1476 


AJ010317 


Fugu 
rubripes 


Sand 


1278 


68 


1477 


U42831 


Caenorhabditis
elegans


coded for by C. elegans cDNA
yk99b4.3; similar to human 
transforming protein 
(PIR:S22157) 


846 


44 


1478 


X62447 


Homo sapiens 


PR 264 


543 


61 


1479


X82209 


Homo sapiens 


MN1 


71l£ 


100 


1480 


U10536 


Pan paniscus 


MHC class I A


675 


84 



WO 01/53312


PCT/US00/34263


1481 


AL078599 


Homo sapiens 


dJ991C6.1 (novel protein 
similar to C. elegans 
F55A12.9 (Tr:P91086) ) 


1274 


65 


1482


«?o VII 


Schizosaccharomyces
pombe


putative vacuolar protein 


256 


29 


1483 


AB005662 


Mus musculus 


JNK/SAPK-associated protein-1


4968 


92 


1484




Homo sapiens


hypothetical protein 


716 


100 


1485 


M27878 


Homo sapiens 


DNA binding protein 


1006 


53 


1486


i 0 y ID x 


Homo sapiens 


Amino acid sequence of a 
partial protein kinase. 


575 


99 


1487


AO fl IDb 


Saccharomyces
cerevisiae


ATH1 


341 


29 


1488


AT?n 1 QQC1 


Homo sapiens


RNA helicase


446 


34 


1489 


U56966 


Caenorhabditis
elegans


coded for by C. elegans cDNA 
yk30b3.5; coded for by C.
elegans cDNA yk30b3.3


620 


42 


1490 


AE000989


Archaeoglobus
fulgidus


enoyl-CoA hydratase (fad-4) 


533 


46


1491


novo j j 


Rattus
norvegicus


adenylyl cyclase type IV 


707 


95 


1492 


Y73342 


Homo sapiens 


HTRM clone 2709055 protein 
sequence . 


3513 


99


1493


XX / AAV 


Homo sapiens 


Human secreted protein (clone 
f j 283 -11 ) . 


462 


37 


1494 


atti -a -a c 7 n 


Mus musculus 


ARL-6 interacting protein-2 


701 


97 


1495 


Y94897 


Homo 
sapiens 


Human protein clone HP10574 . 


1371 


100 


1496




Homo sapiens 


dJ747H23.2 (novel protein) 


1550 


100 


1497 


apn-174 a 7 


Homo sapiens 


ribosomal S6 protein kinase 


2427 


100 


1498 


AL445067 


Thermoplasma 
acidophilum 


putative target YPL207w of 
the HAP2 transcriptional 
complex related protein 


269 


35 


1499


nnni Q Q d 7 


Homo sapiens 


X11L-binding protein 51


227 


36


1500


HJ ^ / / / O V 


Homo sapiens 


UBASH3A protein 


3509 


100 


1501 




Homo 
sapiens 


dJ93K22.1 (novel protein 
(contains DKFZP564B116) ) 


2439 


100 


1502 


AFT 1 ° R Qfi 


Homo sapiens 


TALE homeobox protein Meis2b


1140 


100 


1503 


AF178948 


Homo sapiens 


TALE homeobox protein Meis2a 


1177 


100 


1504




Homo sapiens 


Human secreted protein clone 
pm749_8 protein sequence SEQ
ID NO: 16.


1442 


99 


1505 


X82494 


Homo sapiens 


fibulin-2


3580 


99 


1506


Ay a z y b 


Homo sapiens 


ubiquitin hydrolase 


783 


42 


1507 


AL034548 


Homo sapiens 


dJ1103G7.6 (novel protein) 


1098


100


1508


Y76144 


Homo sapiens 


Human secreted protein 
encoded by gene 21. 


1736 


100 


1509 


Af A A U JL O Z 


Homo sapiens 


uncharacterized hypothalamus 
protein HT008 


1181 


98 


1510 




Caenorhabditis
elegans


Gene probably begins in the 
next cosmid 


415 


58 


1511




Neurospora 
crassa 


related to MDM1 protein 


196 


29 


1512


UX /o29 


Homo 
sapiens 


N-acetylgalactosamine 6- 
sulfate sulfatase (GALNS) 


1829 


100


1513


At lb 8717 


Homo sapiens 


x 009 protein 


694 


99 


1514 


AJ243531 


Homo sapiens


nM15 protein 


735 


100 


1515 


AC003672 


Arabidopsis 
thaliana


putative C3HC4-type RING zinc 
finger protein 


407 


30 


1516 


AF115435 


Rattus 
norvegicus 


syntaxin 17 


1374 


90 


1517 


AF003140 


Caenorhabditis
elegans


C44E4.5 gene product 


274 


31 


1518 


AB002584 


Rattus 
norvegicus 


beta-alanine-pyruvate
aminotransferase


2238 


82 


1519 


AL121764 


Schizosaccharomyces
pombe


yeast atp12 protein precursor
homolog


270


30


1520 


AF255910 


Homo 
sapiens 


vascular endothelial 
junction-associated molecule 


547 


100 


1521 


D31764 


Homo sapiens 


KIAA0064 


170 


27 


1522 


Y66634 


Homo 
sapiens 


Membrane -bound protein 
PRO190. 


985


100 


1523 


Y94450 


Homo sapiens 


Human inflammation associated 
protein 


250 


43 


1524 


AC000107


Arabidopsis 
thaliana 


F17F8.22 


277 


37


1525 


AF109377 


Mus musculus 


ldlBp 


1277 


83 


1526 


AL031427


Homo sapiens 


dJ167A19.4 (novel protein) 


1432 


99 


1527 


Y08135


Mus musculus 


acid sphingomyelinase-like
phosphodiesterase






1528 


AK024423 


Homo sapiens 


FLJ00012 protein 


611 


1 nh 


1529 


AF154502 


Homo sapiens 


quiescent cell proline 
dipeptidase


679 


100 


1530 


AF205598 


Homo sapiens 


transposase-like protein


X J 0 0 


100


1531 


AF251039 


Homo sapiens 


putative zinc finger protein 


1420 


jL> 


1532 


W74805 


Homo sapiens 


Human secreted protein 
encoded by gene 77 clone 
HOEAS24 . 


493 


57 


1533 


AF039023 


Homo sapiens 


Ran-GTP binding protein; 
RanBP6 


5707 




1534 


AC007190 


Arabidopsis 
thaliana 


F23N19.9 


374 


1 7 


1535 


AB027564 


Homo sapiens 


DINB1 


4482 


100


1536 


Y36178 


Homo sapiens 


Human secreted protein 


377 


87 


1537 


Y50907


Homo sapiens


Human fetal brain cDNA clone 
vb3_1 derived protein.


3693


99 


1538 


AF017368 


Mus musculus 


faciogenital dysplasia 
protein 2 


177 


47 


1539 


AF266756 


Homo sapiens


sphingosine kinase


2011


99 


1540 


Z48804


Homo sapiens 


OA1


2238 


100 


1541


AF000195 


Caenorhabditis
elegans


contains similarity to Pfam

domain: PF00169 (PH) , 
oLutc- x\j . o f n. — v ci j.LiK=x . ye— Uj , 
N=l 


"no 


42 


1542 


Y71159 


Homo sapiens 


Human phosphodiesterase 
interacting protein, 
myomegalin.


9415 


99 


1543 


X76092 


Homo sapiens 


DNA binding protein RFX3 


3327 


100 


1544 


AB015330 


Homo sapiens 


HRIHFB2007


631 


OK) 


1545 


AF198487 


Homo sapiens 


transcription factor LDP-lb 


2822 


100 


1546 


AF016417 


Caenorhabditis
elegans


Similar to BZIP transcription 
factor 


518 




1547


X55885 


Homo sapiens 


KDEL receptor 


1106 


100 


1548 


AB035495


Carassius 
auratus 


ubiquitin-activating enzyme
E1


a J 0 


A O 


1549 


AL021707 


Homo sapiens 


dJ508I15.4 (KIAA0668) 


3688 


100 


1550 


AJ223978 


Bacillus
subtilis 




292 


42 


1551 


nc •!••* o A j 


Drosophila
melanogaster


BCDNA. C^HOJJ 77 


822 


44 


1552 




Schizosaccharomyces
pombe


putative mannosyl transferase 
involved in N-glycosylation 


435 


37 


1553 


AF079527


Mus musculus 


IER5 


691 


63


1554


AB024291 


Rattus 
norvegicus 


acetoacetyl-CoA synthetase 


1099 


88 


1555 


Y44722 


Homo sapiens 


Human immune system molecule, 
ISMO-3. 


1780


99 


1556 


AF116553 


Drosophila 
melanogaster 


antennal-specific short-chain
dehydrogenase/reductase 


277 


32 


1557 


Y71056


Homo sapiens


Human membrane transport
protein, MTRP-1.


1975


99


1558 


Y71056 


Homo sapiens 


Human membrane transport 
protein, MTRP-1. 


1975 


99


1559 


Y71056


Homo sapiens 


Human membrane transport 
protein, MTRP-1 . 


1894 


97 


1560 


AF092050 


Mus musculus


beta-1,3-N-
acetylglucosaminyltransferase


262 


44 


1561 


AL109827 


Homo sapiens 


dJ309K20.2 (acrosomal protein
ACR55 (similar to rat sperm
antigen 4 (SPAG4)))


1607 




1562 


AJ131890 


Homo sapiens 


DNA polymerase lambda 


3002 


100 


1563 


AL035424 


Homo sapiens 


u^iii tiL'iii . jl. \uy VGA UiUUCill 

similar to Drosophila Kelch 

nrnhpi n<a^ 




10 0 


1564 


AC002400 


Homo sapiens


vjcuc ±j x, uuui» l wjLuii binuidrii.y 
to Ubicruitin bindiner pn^vmp 




100 


1565 


AC005306 


Homo sapiens 


R27216 1 


Ql Q 




1566 


AF000195 


Caenorhabditis
elegans


contains similarity to Pfam

domain: PF00169 (PH) , 

*J & — £• VS ■ U / A-J V Ul UC — X < ^ G v 3 / 

N=l 


jjU 


45 


1567 


AB033281 


Homo 
sapiens 


F-box and WD-repeats protein
beta-TRCP2 isoform C 


2879 


100 


1568


D49473 


Mus musculus


cruncatea ioxtti or yoxi / 


1047 


78 


1569 


AK025270 


Homo sapiens 


unnamed protein product 


210 


91 


1570 


X75756 


Homo sapiens


protein kinase C mu 


4797 


99 


1571 


AF145713 


Homo sapiens


OL-tl X f - X 


2388 


100 


1572 


AE003831 


Drosophila
melanogaster


L.ur±oft^b gene proauct 


180 


31 


1573 


AF074603


griseus 
subsp . 
griseus 


M nn p 
iiuxir 


205 


38 


1574 


U28993 


Caenorhabditis
elegans




JL*± 4 


27 


1575 


AF129507 


Homo sapiens 


transcription factor ICBP90 


287 


68 


1576 


X64878 


Homo sapiens




2002 


100 


1577 


AF237711 


Drosophila 
melanogaster


Diablo 


421 


54 


1578 


G00975 


Homo sapiens


ID NO: 5056. 


480 


100 


1579 


AF248744 


inm narvum 


thrombospondin-related
adhesive protein


123 


33 


1580 


AL121782 


Homo sapiens 


dJ585114.2 (novel protein
(translation of cDNA
Em:AK000219))


6*3 


100 


1581 


AF041853


Homo sapiens 


/^xnvoiii x. cxiux. jl _y iiiGiiixvCx. UxULClll 

KIF3A 




33 


1582 


AF025441 


Homo sapiens 


Opa-interacting protein


1198 


100


1583 


AEO01BO3 


Thermotoga 
maritima 


glycerate kinase, putative 


349 


34 


1584 


AF252283 


Homo sapiens 


Kelch-like 1 protein


3973


100 


1585 


AF169675 


Homo sapiens


leucine-rich repeat
transmembrane protein FLRT1 


3494 


99 


1586 


AF118274 


Homo sapiens 


DNb- 5 


2628


97 


1587 


X79440 


Homo sapiens 


NADP+-dependent malic enzyme


3167 


99 


1588 


X99802 


Homo sapiens 


ZYG homologue 


3966


99 


1589 


AF169803 


Homo sapiens 


flavohemoprotein b5+b5R


2563


100 


1590 


Y29861


Homo sapiens 


Human secreted protein clone 
cb98_4 . 


181 


47 


1591


Z25535


Homo sapiens 


nuclear pore complex protein 
hnup153


7567 


99 


1592 


X13293 


Homo sapiens 


B-myb protein (AA 1-700) 


3678 


99 


1593 


M74027 


Homo sapiens 


mucin 


242 


27 


1594 


AL139314 


Schizosaccharomyces
pombe


hypothetical protein


235


54


1595 


W78324 


Homo sapiens 


Fragment of human secreted

protein encoded by gene 81. 




98 


1596 


Y94906 


Homo sapiens 


Human secreted protein clone
rb649_3 protein sequence SEQ
ID NO: 18. 


Z Z J o 


98


1597 


AF174605 


Homo sapiens 


F-box protein Fbx25 


1408 




1598 


AB032254 


Homo 
sapiens 


bromodomain adjacent to zinc 
finger domain 2A 


9676 


98 


1599 


X73114 


Homo sapiens 


slow MyBP-C 


5568 


95 


1600 


X82200 


Homo sapiens 


gpStaf 50 


2305 


100 


1601 


Y00876 


Homo 
sapiens 


Human LAPH-1 protein 
sequence . 


1 l/Q 
J. J.H J? 


Q Q 


1602 


AJ223351 


Homo sapiens 


HIRA- interact" 1 no r»r*ot-c»Tn 1 


Z OZ J. 


Q Q 


1603 


AJ222801 


Homo sapiens 


neutral sphingomyelinase 


2268 




1604 


AJ222801 


Homo sapiens 


neutral sphingomyelinase


It) u J. 


99 


1605


AF185576 


Mus musculus 


POZ/zinc finger transcription 
factor ODA-fi 


3435 


97 


1606 


AF093744 


Homo sapiens 


unknown 


J. J JL 


100 


1607 


A12142 


synthetic
construct


IFN-pseudo-omega 2


800 


98 


1608 


Y57949 


Homo sapiens 


Human transmembrane protein
HTMPN-73. 


lobe 


100 


1609 


AF151044 


Homo sapiens 


HSPC210 


681 


97 


1610 


X15218 


Homo sapiens 


ski protein (AA 1-728)


3765 


100 


1611 


Y08200 


Homo sapiens


a. dxj y cr dnyige tcinyx 

t* TTF\ n q f P ya e> <a 


2976 


100 


1612 


AF220560 


Homo sapiens 


B/K protein 


2486 


99 


1613 


AC004481 


Arabidopsis

thaliana 


iiuuuiin iitvc protein 


371


26 


1614 


Y09501 


Homo sapiens


NADH-cytochrome-b5 reductase


1607 


100 


1615 


Y15521 


Homo sapiens


start position 1 


3150 


97 


1616 


AJ010750 


Rattus
norvegicus


Castration induced prostatic 
apoptosis related protein-1,
(CIPAR-1) 


890 


62 


1617 


X58079 


Homo sapiens 




481


100 


1618 


Y64678 


Homo 
sapiens 


Membrane-bound protein
PRO1009. 


967 


100 


1619 


AJ242973 


Homo sapiens


peptide methionine sulfoxide
reductase 


929 


100 


1620 


AF150733 


Homo sapiens 


J^t-LJ w A. *B pLULClil 


288 


100 


1621 


AJ007509 


Homo sapiens


riio- 1> oKua -associated protein 


4646 


98 


1622 


X64177 


Homo sapiens 


metallothionein 


380 


100 


1623 


AE001045 


Archaeoglobus
fulgidus


A. fulgidus predicted coding 
region ftruoDj 


240 


36 


1624 


AL355013 


Schizosaccharomyces
pombe


mitochondrial carrier protein


403 


34 


1625 


Y66746 


Homo 
sapiens 


Membrane-bound protein

PRO! 1 9fl 


1184 


100 


1626 


D90053 


Sus scrofa 




863 


100 


1627 


Y35954


Homo sapiens


Extended human secreted
protein sequence, SEQ ID NO
203.


756 


100 


1628 


AL031775 


Homo sapiens 


dJ30M3.2 (novel protein) 


470 


100


1629 


AF132484 


Mus musculus 


unknown 


286 


68 


1630 


AF017096


Drosophila 
melanogaster 


similar to C. elegans 
R10H10.6 and S. cerevisiae 
YD8419.03c






1631 


X03077 


Homo sapiens 


lactate dehydrogenase-A 


1704 


100 


1632 


AF151084 


Homo sapiens 


HSPC250 


763 


100 


1633


AJ001874 


Homo sapiens 


orf 


255 


97 


1634 


AC012187 


Arabidopsis 
thaliana 


Contains weak similarity to 
GATA-6 DNA-binding protein
gb|H36l35, gb|Z26200 come
from this gene.


143 


38 


1635 


AF026246 


Homo sapiens 


HERV-E integrase


411 


90 


1636 


Y50943 


Homo sapiens 


Human adult brain cDNA clone
ve8_1 derived protein.


hoc 
xx<£b 


95 


1637 


AF134593 


Homo sapiens 


L-pipecolic acid oxidase


*5 ft c a 
z ub o 


99 


1638 


AJ238247 


Mus musculus 


putative phosphatase subunit 


1943 


96 


1639 


Y94942 


Homo sapiens


Human secreted protein clone
yk251_1 protein sequence SEQ

in NO • QD 

XL/ IV u . ? u . 


1320 


100 


1640 


AF235030 


Homo sapiens 


BM88 antigen 


766 


99 


1641 


AF233288 


Drosophila
melanogaster


WDS 


358 


26 


1642 


M19351 


Mus musculus


immunoglobulin heavy chain 
binding protein 


145 


34 


1643 


Y70452 


Homo sapiens 


Human membrane channel 
protein-2 (MECHP-2) . 


1352 


100 


1644 


AF176520 


Mus musculus


WD repeat-containing F-box
protein FBW5 


2676 


88 


1645 


W67816 


Homo sapiens 


Human secreted protein 
encoded by gene 10 clone 
HCEMU42. 


1156 


100 


1646


X67155 


Homo sapiens


mitotic kinase-like protein-1


4456 


99 


1647 


MS3180 


Homo sapiens


threonyl-tRNA synthetase


1040 


61 


1648


Y&-7342 


Homo sapiens 


Human signal peptide 
containing protein HSPP-119 
SEQ ID NO: 119. 


1566 


93 


1649 




Homo sapiens 


Tumor necrosis factor 
receptor 1 death domain 
ligand (clone 3TW) . 


4137 


100 


1650 


AC007136 


Homo sapiens 


Putative map kinase 
interacting kinase 


056 


99 


1651 


AB015346 


Homo sapiens 


Epsl5R 


4464 


99 


1652 


AL161576


Arabidopsis
thaliana


putative protein 


1341 


48 


1653 


AC005313 


Arabidopsis 

thaliana


putative calmodulin 


288 


28 


1654 


AL031428 


Homo sapiens 


dJ184J9.1 (KIAA0601 protein) 


3526 


100


1655 


AL031428 


Homo sapiens 


dJ184J9.1 (KIAA0601 protein)


3526 


100 


1656 


v X f J> ± U 


Dictyostelium
discoideum


myoM 


297 


32 


1657


Y28919 


Homo 
sapiens 


Human regulatory protein 
HRGP-5. 


2251 


99 


1658 




Homo sapiens 


TPA inducible protein 


2744 


98 


1659 


U76846 


Arabidopsis 

thaliana


ubiquitin-specific protease


137 


35 


1660 


AL078627 


Schizosaccha 

romyces 

pombe 


actin-like protein; (2 actin 
domains ) 


320 


34 


1662 


X52022 




collagen type VI, alpha 3
chain


16274 


99 


1663 


AF300648


Homo 
sapiens 


guanine nucleotide binding 
protein beta subunit 4 


1811 


100 


1664 


AF214736 


Homo sapiens 


EH domain containing protein 
2 


2774 


100 


1665 


Z48613 


Saccharomyces
cerevisiae


unknown 


138 


26 1 


1666 


AF177385 


Homo sapiens


cytochrome c oxidase assembly 
protein isoform 2 


1395 


99 


1667 


AC007842 


Homo sapiens 


BC3 31191_1 


1581 


47 


1668 


S67513 


Borna
disease
virus BDV,
WT-1, Halle
Bl/91, horse
brain, field
isolate.
Peptide, 370
aa


p40


397


43


1669 


Z99753 


Schizosaccharomyces
pombe


putative NOL1-NOP2-sun family
nucleolar protein 


569


A *7 


1670 


G03130 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 7211. 


427 




1671 


M96625 


Gallus 
gallus 


cardiac muscle tensin 


1185 


54 


1672 


AF174482


Homo sapiens 


polycomb 3


2005 


Q Q 


1673 


Y51B46 . 


Homo sapiens 


Human 18.1 homolog protein 
fragment . 


233 


29 


1674 


AF255334 


Homo sapiens 


EXP35 


152 


29 


1675 


Y94867 


Homo 
sapiens 


Human nroh^in cl r>nA HPlfi^fi*} 


i no 

± V 3 


J u 


1676 


Y25712 


Homo sapiens 


Human secreted protein 
encoded from gene 2 . 


3043 


99 


1677 


Y25712 


Homo sapiens 


Human secreted protein 
encoded from gene 2 


1580 


91 


1678 


AF163151 


Homo sapiens 


dentin sialophosphoprotein 
precursor 


170 


17 


1679 


AF163151 


Homo sapiens


dentin sialophosphoprotein
precursor


170


17 


1680 


AK024453 


Homo sapiens 


FLJ00045 protein 


1349 


100 


1681 


AF019236


Dictyostelium
discoideum


TipD 


""ft vT 

fa J. J 


34 


1682 


AJ243459 


Leishmania 
major 


QfotpnnhrmnVinal upsn 
Lcupiiu ^ir A * j. y Lp clii 




26 


1683 


Z69369 


Schizosaccha 

romyces 

pombe 


putative GTP-binding protein 


560 


46 


1684 


X94910 


Homo sapiens 


ERp28 




100


1685 


AF286475 


Takifugu 
rubripes 


retinitis pigmentosa GTPase
regulator-like protein




19 


1686 


AF191298 


Homo sapiens 


vacuolar sorting protein 35 


4087 


100 


1687 


AJ275986 


Homo sapiens 


transcription factor 


2958 


100


1688 


AJ275986 


Homo sapiens 


transcription factor 


1886 


ft ft 


1689 


X07311 


Drosophila 
melanogaster 


heat shock protein 


138 


43 


1690 


AF240463 


Rattus 
norvegicus 


LIS1-interacting protein 
NUDE1 


1383 


83 


1691 


AJ272078 


Homo sapiens 


APOBEC-1 stimulating protein 




C Q 

bo 


1692 


AJ272079 


Homo sapiens 


APOBEC-1 stimulating protein 


1336 


60 


1693 


AF177942 


Xenopus 
laevis 


nciuaijxii Y*t w \J 


1 CCA 


~F~c 

DO 


1694 


AF263539 


Homo sapiens 


arginine N-methyltransferase 


1774 


100 


1695 


AF222689 


Homo 
sapiens 


protein arginine 
N-methyltransferase 1-variant 2 


Hoc 


HI 


1696 


AK000193 


Homo sapiens 


unnamed protein product 


1060 


100 


1697 


AB041035 


Homo sapiens 


kidney superoxide-producing 
NADPH oxidase 


3122 


100 


1698 


AB041035 


Homo sapiens 


kidney superoxide-producing 
NADPH oxidase 


2181 


100 


1699 


AF025772 


Homo sapiens 


C2H2 zinc finger protein 


488 


54 


1700 


Y44676 


Homo sapiens 


Human ARF-Related Protein-1 
(HARP-1). 




9 / 


1701 


AK022407 


Homo sapiens 


unnamed protein product 


J J. J 


98 


1702 


AB024^74 


Homo sapiens 


GTP-binding like protein 2 


1172 


100 


1703 


AF055078 


Homo sapiens 


zinc finger protein 42 


421 


52 


1704 


AF198092 


Mus musculus 


RP42 


1057 


77 


1705 


AE003573 


Drosophila 
melanogaster 


CG12474 gene product 


161 


33 


1706 


AB036345 


Drosophila 
melanogaster 


aquaporin 


164 


24 


1707 


Y55927 


Homo sapiens 


Human STLK2 protein. 


2146 


100 


1708 


U27121 


Danio rerio 


G12 


212 


47 


1709 


AL391710 


Arabidopsis 
thaliana 


putative protein 


505 


50 


1710 


B01311 


Homo sapiens 


Human PRO241 polypeptide. 


1649 


97 


1711 


U40750 


Mus musculus 


formin binding protein 30 


4561 


85 


1712 


um i i 1 o 
Au U 11 J. lb 


Mus musculus 


skeletal muscle and cardiac 
protein 


1490 


89 


1713 




Homo 
sapiens 


membrane-associated nucleic 
acid binding protein 


4416 


99 


1714 




Homo 
sapiens 


membrane-associated nucleic 
acid binding protein 


2960 


100 


1715 


U08227 


Rattus 
norvegicus 


Ras-related protein 


511 


51 


1716 


API Gfi7Qc; 


Rattus 
norvegicus 


schlafen-4 


1129 


44 


1717 


aft 96"^ rid. 


Homo sapiens 


SUMO-1-specific protease 


5804 


99 


1718 


AL355737 


Homo sapiens 


HMG20A 


1782 


100 


1719 




Halocynthia 
roretzi 


HrPET-1 


1069 


46 


1720 


2iT?m 11T7 

/ir u / IjX / 


Mus musculus 


COP9 complex subunit 7b 


1297 


97 


1721 


™ Z ' 


Homo sapiens 


heyl protein 


1681 


99 


1722 




Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6063. 


718 


100 


1723 


AL032643 


Caenorhabdit 
is elegans 


similar to Uncharacterized 
protein family UPF0034, 


825 


41 


1724 


G01972 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 6053. 


586 


92 


1725 


Y94441 


Homo 
sapiens 


Human Adipose Specific 
Protein 1. 


1231 


100 


1726 


ftf / Jjfi^ J 


Homo sapiens 


CGI-201 protein 


4397 


99 


1727 


AF183426 


Homo sapiens 


HT004 protein 


1816 


99 


1728 


m no d» 


Bos taurus 


neurocalcin 


1002 


99 




iii oozy 


Gallus 
gallus 


tensin 


1411 


84 


i Tin 


•7*7 "3 V! 1 1 
4 / J 4 Z J 


Caenorhabdit 
is elegans 


cDNA EST EMBL:Z14908 comes 
from this gene-cDNA EST this 
gene 


233 


41 


1732 




Homo sapiens 


PRO01Q5 


470 


30 


1733 


AJ277724 


Homo sapiens 


histone deacetylase 8 


2015 


100 


1734 


G04050 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8131. 


503 


95 


1735 


D45913 


Mus musculus 


leucine-rich-repeat protein 


3531 


94 


1736 


AF096709 


Drosophila 
virilis 


failed axon connections 
protein 


276 


32 


1737 


AF195120 


Homo sapiens 


dynactin p62 subunit 


2417 


99 


1738 


L15314 


Caenorhabdit 
is elegans 


contains similarity to Pfam 
family PF01772 N»l 


206 


37 


1739 


X54618 


Listeria 
monocytogenes 


phosphatidylinositol specific 
phospholipase C 


134 


27 


1740 


AL031658 


Homo sapiens 


dJ310O13.4 (novel protein 
similar to predicted C. 
elegans and C. intestinalis 
proteins) 


123 


31 


1741 


Y35924 


Homo sapiens 


Extended human secreted 
protein sequence, SEQ ID NO. 
173 . 


1013 


99 


1742 


AC013354 


Arabidopsis 
thaliana 


F15H18.15 


202 


32 


1743 


W75771 


Homo 
sapiens 


Human GTP binding protein 
APD08. 


1932 


59 


1744 


W75771 


Homo 
sapiens 


Human GTP binding protein 
APD08. 


1854 


61 


1745 


AF221098 


Homo 
sapiens 


Ral guanine nucleotide 
exchange factor RalGPSIA 


1224 


70 


1746 


Y99372 


Homo sapiens 


Human PRO1430 (UNQ736) amino 
acid sequence SEQ ID NO: 116. 


1332 


99 


1747 


Y94294 


Homo sapiens 


Human coenzyme A-utilising 
enzyme CoAEN-2. 


842 


100 


1748 


AK024436 


Homo sapiens 


FLJ00026 protein 


1619 


100 


1749 


AE000877 


Methanobacterium 
thermoautotrophicum 


conserved protein 


231 


36 


1750 


AF101361 


Drosophila 
melanogaster 


Abnormal X segregation 


193 


33 


1751 


Y15067 


Homo sapiens 


ZNF232 


889 


100 


1752 


AF251038 


Homo sapiens 


GAP-like protein 


822 


100 


1753 


AC003093 




OXYSTEROL-BINDING PROTEIN; 
similarity to P22059 


352 


57 


1754 


X69089 




idjau protein 


5703 


99 


1755 


AL049795 


Homo sapiens 


dJ622L5.3 (novel protein) 


1039 


100 


1756 


AL031393 


Homo sapiens 


aj/jjijib.i (Zinc-finger 
protein) 


2765 


100 


1757 


AB040672 


Homo sapiens 


UDP-GalNAc:polypeptide 
N-acetylgalactosaminyltransferase 


2020 


99 


1758 


AL022238 


Homo sapiens 


dJ1042K10.4 (novel protein) 


776 


43 


1759 


AF117653 


Homo sapiens 


double homeobox protein 


375 


54 


1760 


VI TflfC 

i x J. u o b 


Homo sapiens 


hNop56 


2959 


99 


1761 


AL049712 


Homo sapiens 


dJ686C3.2 (nucleolar protein 
hNop56) 


2595 


99 


1762 


AC002394 


Homo 
sapiens 


Gene product with similarity 
to dynein beta subunit 


1542 


51 


1763 


AF169017 


Homo sapiens 


formiminotransferase 
cyclodeaminase 


877 


100 


1764 


U91541 


Homo sapiens 


human formiminotransferase 
cyclodeaminase (ftcd) protein, 
carboxy-terminal end 


596 


100 


1765 


anni lice 


Bacillus 

Vial Arinv^ y-t 


YlqF 


350 


34 


1766 


Y38421 


Homo sapiens 


Human secreted protein 
encoded by gene No. 36. 


145 


71 


1767 


AC009176 


Arabidopsis 
thaliana 


putative ribulose-1,5- 
bisphosphate 
carboxylase/oxygenase small 
subunit N-methyltransferase I 


216 


27 


1768 


AK000647 


Homo sapiens 


unnamed protein product 


737 


99 


1769 


AJ238982 


Homo sapiens 


VNN3 protein 


2665 


99 


1770 


U73522 


Homo sapiens 


AMSH 


1214 


56 


1771 


U89435 


Mus musculus 


till JVI JUWJ 1 


829 


86 


1772 


S70011 


Rattus sp. 


tricarboxylate carrier 


1604 


95 


1773 


AL035086 


Homo sapiens 


dJ44A20.2 (novel protein) 


2036 


100 


1774 


Y99426 


Homo sapiens 


Human PRO1604 (UNQ785) amino 
acid sequence SEQ ID NO: 308. 


1057 


99 


1775 


AF110330 


Homo sapiens 


glutaminase 


3146 


100 


1776 


AJ269 , ?7 Q 


Homo sapiens 


glycerol 3-phosphate permease 


2787 


100 


1777 


Z81579 


Caenorhabdit 
is elegans 


cDNA EST yk76fl.5 comes from 
this gene 


232 


31 


1778 


AYn079 7 Q 


Homo sapiens 


monooxygenase X 


1875 


99 


1779 




Schizosaccha 

romyces 

pombe 


oxysterol-binding protein 
family 


644 


38 


1780 


AF254260 


Homo sapiens 


tuftelin 1 


177Q 

jl / z y 


100 


1781 


L07924 


Mus musculus 


guanine nucleotide 
dissociation stimulator 


247 


50 


1782 


AF295773 


Homo 
sapiens 


ral guanine nucleotide 
dissociation stimulator 


142 


49 


1783 


AK024475 


Homo sapiens 


FLJ00068 protein 


4333 


100 


1784 


AK024475 


Homo sapiens 


FLJ00068 protein 


3996 


93 


1785 


G03933 


Homo sapiens 


Human secreted protein, SEQ 
ID NO: 8014. 


570 


100 


1786 


S82637 


Homo sapiens 


Ig lambda-like gene/beta- 
glucuronidase exon 11 homolog 


247 


100 

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TABLE 3 



SEQ ID NO: 


ACCESSION 
NO. 


DESCRIPTION 


RESULTS * 


2 


BL00240 


Receptor tyrosine kinase 
class III proteins. 


BL00240B 24.70 8.250e- 
12 157-181 


3 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109D 17.04 8.085e- 
13 358-381 


4 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 9.400e- 
10 1129-1146 BL00028 
16.07 1.257e-09 820- 
837 


5 


BL00023 


Type II fibronectin 
collagen-binding domain 
proteins. 


BL00023 24.31 8.920e- 
33 413-450 BL00023 
24.31 4.545e-27 353- 
390 


6 


BL00023 


Type II fibronectin 
collagen-binding domain 
proteins . 


BL00023 24.31 8.920e- 
33 413-450 BL00023 
24.31 4.545e-27 353- 
390 


7 


BL00023 


Type II fibronectin 
collagen-binding domain 
proteins. 


BL00023 24.31 8.920e- 
33 413-450 BL00023 
24.31 4.545e-27 353- 
390 


8 


BL00023 


Type II fibronectin 
collagen-binding domain 
proteins. 


BL00023 24.31 8.920e- 
33 413-450 BL00023 
24.31 4.545e-27 353- 
390 


9 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 5.119e- 
09 863-917 


10 


PR00464 


E-CLASS P450 GROUP II 
SIGNATURE 


PR00464D 17.40 6.182e- 
12 294-312 PR00464G 
12.41 4.231e-11 377- 
393 


11 


PR00734 


GLYCOSYL HYDROLASE 
FAMILY 7 SIGNATURE 


PR00734I 11.46 4.29£e- 
09 502-520 


12 


PF00023 


Ank repeat proteins . 


PF00023B 14.20 6.500e- 
10 89-99 PF00023B 
14.20 2.636e-09 56-66 


14 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031B 15.41 3.848e- 
09 79-113 


15 


PR00208 


GLIADIN AND LMW GLUTENIN 
SUPERFAMILY SIGNATURE 


PR00208A 12.59 9.8^8e- 
10 517-535 PR00208A 
12.59 2.233e-09 520- 
538 


17 


PD00066 


PROTEIN ZINC-FINGER 
METAL-BINDI. 


PD00066 13.92 8.200e- 
14 282-295 PD00066 
13.92 9.400e-14 477- 
490 PD00066 13.92 
6.500e-13 505-518 
PD00066 13.92 9.500e- 
13 254-267 PD00066 
13.92 1.429e-12 393- 
406 PD00066 13.92 
6.571e-12 421-434 


18 


BL00845 


CAP-Gly domain proteins. 


BL00845 1^.43 2.200e- 
25 55-80 


20 


BL00487 


IMP dehydrogenase / GMP 
reductase proteins. 


BL00487E 16.12 5.737e- 
26 154-199 BL00487F 
18.79 8.984e-22 235- 
276 BL00487G 26.82 
4.082e-12 287-329 


21 


BL00487 


IMP dehydrogenase / GMP 
reductase proteins. 


BL00487E 16.12 5.737e- 
26 154-199 BL00487F 
18.79 8.984e-22 235- 
276 BL00487G 26.82 
4.082e-12 348-390 


22 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 3.250e- 
26 302-333 


23 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 3.250e- 
26 302-333 


25 


BL00115 


Eukaryotic RNA 
polymerase II 
heptapeptide repeat 
proteins. 


BL00115T 8.45 7.273e- 
29 1208-1242 BL00115Q 
18.08 2.776e-21 953- 
983 BL00115Y 11.86 
8.000e-17 1604-1650 
BL00115M 19.19 8.130e- 
16 731-774 BL00115H 
14.34 9.392e-16 463- 
496 BL00115A 15.44 
7.414e-15 43-82 
BL00115R 6.50 6.128e- 
14 983-1010 BL00115J 
16.71 9.289e-14 591- 
617 BL00115I 8.33 
4.336e-13 535-590 
BL00115L 12.25 5.939e- 
13 662-694 BL00115G 
11.65 6.011e-13 435- 
463 BL00115K 15.03 
3.417e-10 617-659 
BL00115O 16.76 5.805e- 
10 863-913 BL00115P 
11.54 7.538e-10 913- 
953 BL00115S 18.24 
7.968e-10 1010-1052 
BL00115U 10.34 4.475e- 
09 1242-1265 


26 


BL00420 


Speract receptor repeat 
proteins domain 
proteins . 


BL00420A 20.42 4.109e- 
11 81-110 BL00420A 
20.42 8.820e-10 84-113 


27 


BL00050 


Ribosomal protein L23 
proteins . 


BL00050A 23.71 9.250e- 
27 94-127 BL00050B 
14.81 8.125e-12 133- 
147 


28 


PR00925 


NONHISTONE CHROMOSOMAL 
PROTEIN HMG17 FAMILY 


PR00925B 3.73 3.089e- 
10 41-54 


29 


PF00756 


Putative esterase. 


PF00756C 14.12 1.108e- 
09 486-516 


32 


BL00557 


FMN-dependent alpha- 
hydroxy acid 
dehydrogenases proteins. 


BL00557D 17 . l£ 5.06"5e- ' 
37 274-316 BL00557A 
35.08 8.909e-29 24-73 
BL00557C 15.59 1.000e- 
28 227-257 BL00557B 
21.27 8.898e-22 130- 
169 


34 




SHC PHOSPHOTYROSINE 
INTERACTION DOMAIN 
SIGNATURE 


PR00629E 9.90 5.886e- 
35 299-328 PR00629F 
10.95 8.364e-32 334- 
361 PR00629B 13.66 
3.786e-27 224-247 
PR00629A 13.45 8.364e- 
21 206-222 PR00629C 
3.80 4.000e-12 249-261 
PR00629D 12.45 3.739e- 
11 276-286 


35 


PD01270 


RECEPTOR FC 
IMMUNOGLOBULIN AFFIN. 


PD01270A 17.22 1.000e- 
40 39-79 PD01270B 
22.18 2.875e-38 94-131 
PD01270D 24.66 3.700e- 
34 171-207 PD01270C 
19.54 3.455e-30 137- 
166 


36 


PD01270 


RECEPTOR FC 
IMMUNOGLOBULIN AFFIN. 


PD01270A 17.22 1.000e- 
40 39-79 PD01270B 
22.18 2.875e-38 94-131 
PD01270D 24.66 3.700e- 
34 171-207 PD01270C 
19.54 3.455e-30 137- 
166 


37 


BL00412 


Neuromodulin (GAP-43) 
proteins . 


BL00412C 10.28 9.241e- 
10 264-298 


38 


BL00412 


Neuromodulin (GAP-43) 
proteins . 


BL00412C 10.28 9.241e- 
10 264-298 


39 


BL00412 


Neuromodulin (GAP-43) 
proteins . 


BL00412C 10.28 9.241e- 
10 264-298 


40 


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380B 12.64 7.366e- 
14 342-360 PR00380C 
13.18 6.927e-13 375- 
394 PR00380D 9.93 
2.180e-12 429-451 
PR00380A 14.18 5.154e- 
12 143-165 


44 


BL00345 


Ets-domain proteins. 


BL00345B 21.28 1.000e- 
40 239-290 BL00345A 
13.96 2.452e-14 204- 
223 


45 


BL00345 


Ets-domain proteins. 


BL00345B 21.28 1.000e- 
40 215-266 BL00345A 
13.96 2.452e-14 180- 
199 


46 


DM01551 


kw OSTEOINDUCTIVE YOPM 
MEMBRANE OUTER. 


DM01551A 15.63 3.538e- 
26 172-202 DM01551C 
14.62 3.571e-17 232- 
252 DM01551B 8.84 
4.750e-11 214-226 


47 


PR00876 


NEMATODE METALLOTHIONEIN 
SIGNATURE 


PR00876B 7.6*£ 9.328e- ' 
11 246-260 


48 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL- 
BINDING NU. 


PD01066 19.43 4.231e- 
33 6-45 


50 


BL00972 


Ubiquitin carboxyl- 
terminal hydrolases 
family 2 proteins. 


BL00972D 22.55 7.750e- 
19 994-1019 BL00972A 
11.93 7.120e-18 216- 
234 BL00972E 20.72 
9.471e-14 1020-1042 
BL00972C 16.48 7.000e- 
13 360-375 BL00972B 
9.45 8.269e-10 302-312 


51 


BL00972 


Ubiquitin carboxyl- 
terminal hydrolases 
family 2 proteins. 


BL00972D 22.55 7.750e- 
19 990-1015 BL00972A 
11.93 7.120e-18 216- 
234 BL00972E 20.72 
9.471e-14 1016-1038 
BL00972C 16.48 7.000e- 
13 360-375 BL00972B 
9.45 8.269e-10 302-312 


52 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 3.063e- 
14 10-54 


53 


PR00988 


URIDINE KINASE SIGNATURE 


PR00988A 6.39 8.500e- 
17 20-38 PR00988F 
12.23 7.828e-15 196- 
210 PR00988C 13.64 
6.108e-14 104-120 
PR00988E 8.27 3.872e- 
11 174-186 PR00988D 
5.95 6.878e-10 160-171 
PR00988B 11.60 2.915e- 
09 57-69 


55 


PR00762 


CHLORIDE CHANNEL 
SIGNATURE 


PR00762C 9.29 4.682e- 
21 294-314 PR00762D 
11.29 4.103e-19 509- 
530 PR00762A 14.22 
9.333e-18 199-217 
PR00762F 15.12 3.100e- 
16 563-583 PR00762B 
12.12 6.063e-16 230- 
250 PR00762E 12.07 
2.286e-15 545-562 
PR00762G 14.13 6.276e- 
13 601-616 


56 


BL00216 


Sugar transport 
proteins . 


BL00216B 27.64 8.800e- 
10 153-203 


58 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin 
receptors. 


PF00791B 28.49 2.049e- 
10 1080-1135 




PF00791 


Domain present in ZO-1 
and Unc5-like netrin 
receptors . 


PF00791B 28.49 2.049e- 
10 1062-1117 


61 


PD01929 


KINASE TYPE RESISTANCE 
ANTIBIOTIC TRANSFERASE 
AM. 


PD01929E 10.76 9.018e- 
09 206-221 


fift " " 

D O 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360A 14.59 7.395e- 
09 680-693 




PR00360 


C2 DOMAIN SIGNATURE 


PR00360A 14.59 7.395e- 
09 670-683 


70 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 8.714e- 
10 51-64 


72 


DM00179 


w KINASE ALPHA ADHESION 
T-CELL . 


DM00179 13.97 5.304e- 
09 108-118 


73 


BL00239 


Receptor tyrosine kinase 
class II proteins. 


BL00239B 25.15 7.075e- 
12 118-166 


74 


BL00790 


Receptor tyrosine kinase 
class V proteins . 


BL00790N 13.25 6.116e- 
10 93-120 


76 


DM00471 


0 PROKARYOTIC DNA 
TOPOISOMERASE I. 


DM00471A 11.73 9.357e- 
13 53-66 DM00471B 
8.45 4.857e-12 70-81 


80 


PD02876 


DECARBOXYLASE 
PHOSPHATIDYLSERINE. 


PD02876C 8.80 2.723e- 
13 223-236 PD02876D 
12.13 2.588e-12 334- 
351 


81 


PD02876 


DECARBOXYLASE 
PHOSPHATIDYLSERINE . 


PD02876C 8.80 2.723e- 
13 282-295 PD02876D 
12.13 2.588e-12 393- 
410 


83 


BL00708 


Prolyl endopeptidase 
family serine proteins. 


BL00708B 24.91 7.197e- 
12 570-601 


84 


PR00014 


FIBRONECTIN TYPE III 
REPEAT SIGNATURE 


PR00014C 15.44 8.043e- 
09 985-1004 


86 


PR00678 


PI3 KINASE P85 
REGULATORY SUBUNIT 
SIGNATURE 


PR00678H 9.13 1.379e- 
09 246-269 




PR00320 


G-PROTEIN BETA WD-40 
REPEAT SIGNATURE 


PR00320C 13.01 8.200e- 
09 264-279 PR00320B 
12.19 8.650e-09 264- 
279 


93 


BL00455 


Putative AMP-binding 
domain proteins . 


BL00455 13.31 2.588e- 
14 316-332 


95 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 4.000e- 
10 123-154 


96 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 4.000e- 
10 212-243 


97 


PR00081 


GLUCOSE/RIBITOL 
DEHYDROGENASE FAMILY 
SIGNATURE 


PR00081B 10.38 6.318e- 
13 134-146 PR00081A 
10.53 2.500e-12 54-72 


98 


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 5.500e- 
24 401-423 PR00380D 
9.93 7.188e-20 613-635 
PR00380B 12.64 7.517e- 
16 529-547 PR00380C 
13.18 2.756e-13 560- 
579 


102 


PR00300 


ATP-DEPENDENT CLP 
PROTEASE ATP-BINDING 
SUBUNIT SIGNATURE 


PR00300A 9.5£ 7.545e- 
14 289-308 


104 


BL00479 


Phorbol esters / 
diacylglycerol binding 
domain proteins. 


BL00479B 12.57 6.786e- 
18 298-314 BL00479A 
19.86 4.913e-16 155- 
178 BL00479A 19.86 
4.300e-13 272-295 
BL00479B 12.57 6.294e- 
12 181-197 


106 


BL01019 


ADP-ribosylation factors 
family proteins. 


BL01019A 13.20 8.013e- 
12 43-83 


107 


DM01970 


0 kw ZK632.12 YDR313C 
ENDOSOMAL III. 


DM01970B 8.^0 5.000e- 
16 403-416 


108 


BL00191 


Cytochrome b5 family, 
heme-binding domain 
proteins . 


BL00191K 17.38 4.951e- 
27 238-282 BL0019U 
11.37 6.447e-17 182- 
204 


109 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL- 
BINDING NU. 


PD01066 19.43 4.938e- 
37 8-47 


110 


BL01138 


Scorpion short toxins 
proteins . 


BL01138A 10.96 8.297e- 
10 38-50 


113 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 5.800e- 
23 156-187 BL00107B 
13.31 9.100e-14 225- 
241 


117 


BL00214 


Cytosolic fatty-acid 
binding proteins . 


BL00214B 26.51 1.000e- 
17 46-91 BL00214A 
21.17 7.052e-11 5-31 


118 


BL00107 


Protein kinases ATP- 
binding region proteins . 


BL00107A 18.39 8.££0e- 
13 36-67 


119 


PR00529 


GONADOTROPHIN RELEASING 
HORMONE RECEPTOR 
SIGNATURE 


PR00529C 11.03 7.506e- 
10 158-177 


120 


PR00320 


G-PROTEIN BETA WD-40 
REPEAT SIGNATURE 


PR00320C 13.01 9.400e- 
09 80-95 


121 


PR00320 


G-PROTEIN BETA WD-40 
REPEAT SIGNATURE 


PR00320C 13.01 9.400e- 
09 80-95 


127 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 7.158e- 
13 216-241 


128 


BL01032 


Protein phosphatase 2C 
proteins . 


BL01032C 6.14 3.195e- 
12 147-157 BL01032H 
11.25 5.680e-11 318- 
331 BL01032G 8.33 
8.9?2e-11 282-296 
BL01032I 10.42 8.902e- 
09 379-389 


129 


BL01310 


ATP1G1 / PLM / MAT 8 
family proteins. 


BL01310 14.74 6.694e- 
26 28-64 


130 


PR00990 


RIBOKINASE SIGNATURE 


PR00990B 12.32 9.534e- 
15 47-67 PR00990A 
16.23 5.500e-14 20-42 
PR00990C 12.62 2.412e- 
09 119-133 


133 


BL00880 


Acyl-CoA-binding 
protein. 


BL00880 17.52 5.576e- 
26 72-122 


134 


BL00030 


Eukaryotic RNA-binding 
region RNP-1 proteins. 


BL00030A 14.39 9.308e- 
14 18-37 


135 


PR00215 


NEUROMODULIN SIGNATURE 


PR00215C 13.98 6.779e- 
10 475-496 


136 


BL01310 


ATP1G1 / PLM / MAT 8 
family proteins. 


BL01310 14.74 2.432e- 
29 71-107 


140 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 7.882e- 
14 214-231 BL00028 
16.07 9.471e-14 102- 
119 BL00028 16.07 
2.800e-13 18-35 






WO 01/53312 PCT/US00/34263 



SEQ ID NO: 


ACCESSION 
NO. 


DESCRIPTION 


RESULTS * 








BL00028 16.07 S7500e- 
13 74-91 BL00028 
16.07 9.100e-13 186- 
203 BL00028 16.07 
8.043e-12 46-63 
BL00028 16.07 8.435e- 
12 130-147 BL00028 
16.07 9.217e-12 270- 
287 BL00028 16.07 
6.192e-11 242-259 
BL00028 16.07 4.000e- 
10 158-175 


141 


BL00501 


Signal peptidases I 
serine proteins. 


BL00501D 16.69 9.538e- 
14 113-133 BL00501C 
9.61 8.688e-10 89-101 


143 


BL01020 


SAR1 family proteins. 


BL01020C 15.35 7.722e- 
20 79-130 


146 


PD01066 


PROTEIN ZINC FINGER 
ZINC- FINGER METAL- 
BINDING NU. 


PD01066 19.43 6.400e- 
25 335-374 


149 


BL00126 


3'5'-cyclic nucleotide 
phosphodiesterases 
proteins . 


BL00126C 22.07 1.450e- 
25 509-550 BL00126E 
35.22 3.951e-16 654- 
709 BL00126D 25.50 
1.360e-15 565-604 
BL00126B 15.20 8.200e- 
11 483-495 BL00126A 
27.56 8.269e-11 442- 
479 


151 


BL00632 


Ribosomal protein S4 
proteins . 


BL00632 23.79 5.271e- 
20 106-149 


154 


BL00559 


Eukaryotic molybdopterin 
oxidoreductases 
proteins . 


BL00559I 13.63 5.304e- 
19 29-58 BL00559K 
13.17 2.957e-18 172- 
199 BL00559J 19.63 
8.385e-13 99-151 
BL00559L 13.60 5.814e- 
12 241-259 


155 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 1.692e- 
13 13-35 


157 


BL00406 


Actins proteins. 


BL00406D 12.58 2.547e- 
18 275-330 BL00406A 
9.95 5.776e-16 15-50 
BL00406B 5.47 7.429e- 
12 69-124 BL00406C 
6.75 9.682e-12 128-183 


160 


BL00132 


Zinc carboxypeptidases, 
zinc-binding region 1 
proteins. 


BL00132A 26.07 7.000e- 
14 22-63 BL00132C 
21.35 3.466e-12 104- 
145 


165 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 9.043e- 
13 139-158 


168 


BL00362 


Ribosomal protein S15 
proteins . 


BL00362 24.67 9.700e- 
15 129-172 


169 


BL00039 


DEAD-box subfamily ATP- 
dependent helicases 
proteins. 


BL00039D 21.67 1.000e- 
35 640-686 BL00039A 
18.44 1.964e-13 212- 
251 BL00039B 19.19 
4.553e-13 378-404 
BL00039C 15.63 8.773e- 
12 465-489 


175 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 3.721e- 
12 14-36 


178 


BL01310 


ATP1G1 / PLM / MAT 8 
family proteins. 


BL01310 14.74 2.432e- 
29 133-169 


179 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL- 
BINDING NU. 


PD01066 19.43 5.455e- 
36 6-45 




180 


PR00007 


COMPLEMENT C1Q DOMAIN 
SIGNATURE 


PR00007B 14.16 7.429e- 
20 160-180 PR00007A 
19.33 4.938e-19 133- 
160 PR00007C 15.60 
1.225e-15 206-228 
PR00007D 9.64 6.885e- 
11 238-249 


181 


BL00027 


'Homeobox' domain 
proteins. 


BL00027 26.43 9.526e- 
24 280-323 


182 


BL00027 


'Homeobox' domain 
proteins. 


BL00027 26.43 9.526e- 
24 263-306 


183 


BL00027 


'Homeobox' domain 
proteins. 


BL00027 26.43 9.526e- 
24 280-323 


184 


BL00027 


'Homeobox' domain 
proteins. 


BL00027 26.43 9.526e- 
24 263-306 


188 


PR00929 


AT-HOOK-LIKE DOMAIN 
SIGNATURE 


PR00929C 5.26 3.328e- 
09 460-471 


189 


PR00929 


AT-HOOK-LIKE DOMAIN 
SIGNATURE 


PR00929C 5.26 3.328e- 
09 440-451 


190 


BL00383 


Tyrosine specific 
protein phosphatases 
proteins. 


BL00383F 15.51 7.188e- 
17 666-682 BL00383A 
13.34 8.714e-17 162- 
177 BL00383E 10.35 
1.000e-14 333-344 
BL00383E 10.35 7.300e- 
14 628-639 BL00383F 
15.51 1.720e-13 371- 
387 BL00383C 10.10 
3.000e-13 217-228 
BL00383D 11.92 7.000e- 
13 295-308 BL00383B 
7.61 1.692e-11 187-196 
BL00383C 10.10 1.750e- 
09 509-520 BL00383D 
11.92 4.000e-09 589- 
602 BL00383B 7.61 
8.000e-09 479-488 


191 


PR00450 


RECOVERIN FAMILY 
SIGNATURE 


PR00450C 12.22 7.911e- 
15 83-105 PR00450C 
12.22 6.286e-13 47-69 


193 


PF00564 


Octicosapeptide repeat 
proteins . 


PF00564B 24.74 6.164e- 
16 227-278 


194 


PR00503 


BROMODOMAIN SIGNATURE 


PR00503D 20.81 9.156e- 
15 204-224 PR00503B 
9.96 9.571e-13 170-187 


195 


BL00901 


Cysteine 
synthase/cystathionine 
beta-synthase 
P-phosphate act. 


BL00901C 20.63 3.429e- 
18 67-117 


197 


BL00636 


Nt-dnaJ domain proteins. 


BL00636A 8.07 6.211e- 
17 40-57 BL00636B 
15.11 2.000e-13 67-88 


198 


PR00690 


ADHESIN FAMILY SIGNATURE 


PR00690A 10.86 9.866e- 
09 463-482 


199 


BL01131 


Ribosomal RNA adenine 
dimethylases proteins. 


BL01131A 26.62 2.343e- 
12 84-130 


201 


PR00910 


LUTEOVIRUS ORF6 PROTEIN 
SIGNATURE 


PR00910A 2.51 8.352e- 
12 509-522 


203 


DM00215 


PROLINE-RICH PROTEIN 3 . 


DM00215 19.43 2.286e- 
10 39-72 


206 


PR00261 


LOW DENSITY LIPOPROTEIN 
(LDL) RECEPTOR SIGNATURE 


PR00261A 11.02 4.462e- 
19 65-87 PR00261C 
11.37 9.308e-19 65-87 
PR00261D 12.47 2.667e- 
18 65-87 PR00261B 
14.12 4.000e-18 143- 
165 PR00261A 11.02 
4.833e-18 143-165 
PR00261D 12.47 7.500e- 
18 143-165 PR00261B 
14.12 5.065e-16 65-87 
PR00261C 11.37 8.967e- 
16 143-165 PR00261F 
11.57 4.938e-13 143- 
165 PR00261E 11.08 
7.188e-13 65-87 
PR00261F 11.57 7.188e- 
13 65-87 PR00261E 
11.08 1.643e-11 143- 
165 


209 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin 
receptors . 


PF00791B 28.49 6.143e- 
13 118-173 PF00791C 
20.98 7.680e-10 132- 
171 


211 


PR00007 


COMPLEMENT C1Q DOMAIN 
SIGNATURE 


PR00007A 19.33 5.781e- 
19 131-158 PR00007B 
14.16 4.115e-18 158- 
178 PR00007C 15.60 
1.675e-15 201-223 
PR00007D 9.64 7.231e- 
11 233-244 


212 


BL00183 


Ubiquitin-conjugating 
enzymes proteins. 


BL00183 28.97 1.545e- 
30 43-91 


213 


BL00183 


Ubiquitin-conjugating 
enzymes proteins. 


BL00183 28.97 1.545e- 
30 43-91 


215 


BL00039 


DEAD-box subfamily ATP- 
dependent helicases 
proteins. 


BL00039D 21.67 1.900e- 
29 568-614 BL00039A 
18.44 1.871e-23 21-60 
BL00039C 15.63 1.720e- 
11 364-388 BL00039B 
19.19 4.064e-11 277- 
303 


217 


BL00100 


Chloramphenicol 
acetyltransferase 
proteins . 


BL00100D 17.22 8.484e- 
09 68-106 


219 


PR00213 


MYELIN P0 PROTEIN 
SIGNATURE 


PR00213C 15.94 3.969e- 
11 199-227 


222 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 $.67 1.947e-09 
144-155 


224 


PR00875 


MOLLUSC METALLOTHIONEIN 
SIGNATURE 


PR00875A 5.83 1.000e- 
09 901-913 


225 


BL00636 


Nt-dnaJ domain proteins. 


BL00636B 15.11 8.200e- 
19 18-39 


226 


BL00636 


Nt-dnaJ domain proteins. 


BL00636A 8.07 1.000e- 
21 21-38 BL00636B 
15.11 8.200e-19 45-66 


229 


PR00301 


70 KD HEAT SHOCK PROTEIN 
SIGNATURE 


PR00301F 13.98 7.563e- 
13 329-346 PR00301G 
13.78 4.300e-12 361- 
382 


230 


BL00460 


Glutathione peroxidases 
selenocysteine proteins. 


BL00460A 28.67 8.773e- 
20 35-70 BL00460B 
9.73 7.429e-16 78-96 
BL00460C 14.35 2.831e- 
12 111-134 BL00460D 
16.89 8.773e-11 140- 
160 


231 


PR00647 


SENR ORPHAN RECEPTOR 
SIGNATURE 


PR00647B 10.19 8.522e- 
09 273-287 


233 


BL00292 


Cyclins proteins. 


BL00292B 20.31 7.429e- 
27 244-275 BL00292A 
22.87 7.750e-27 201- 
235 


234 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 6.308e- 
13 7-29 PR00449C 
17.27 4.462e-11 47-70 
PR00449D 10.79 7.120e- 
11 109-123 




235 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019B 11.36 7.300e- 
10 251-265 PR00019B 
11.36 5.320e-09 119- 
133 PR00019B 11.36 
1.000e-08 229-243 


236 


PR00019 


LEUCINE-RICH REPEAT 


PR00019B 11.36 7.300e- 
10 245-259 PR00019B 
11.36 5.320e-09 113- 
127 PR00019B 11.36 
1.000e-08 223-237 


^ J / 




PROTEIN SH3 DOMAIN 
REPEAT PRESYNA. 


PD00289 9.97 8.448e-09 
67-81 


240 


PR00011 


TYPE III EGF-LIKE 
SIGNATURE 


PR00011D 14.03 3.492e- 
10 616-635 


241 


PR00011 


TYPE III EGF-LIKE 
SIGNATURE 


PR00011D 14.03 3.492e- 
10 616-635 


244 


BL00903 


Cytidine and 
deoxycytidylate 
deaminases zinc-binding 
region s. 


BL00903 12.93 8.941e- 
12 54-64 


245 


DM00179 


w KINASE ALPHA ADHESION 
T-CELL. 


DM00179 13.97 8.043e- 
09 124-134 


246 


BL00246 


Wnt-1 family proteins. 


BL00246D 23.97 1.000e- 
40 186-239 BL00246E 
20.32 1.000e-40 SOS- 
SSI BL00246B 13.69 
4.176e-36 105-140 
BL00246A 15.75 2.286e- 
24 70-90 BL00246C 
15.56 4.857e-22 150- 
175 


250 


PR00927 


ADENINE NUCLEOTIDE 
TRANSLOCATOR 1 SIGNATURE 


PR00927E 14.93 5.114e- 
10 253-275 


254 


BL00674 


AAA-protein family 
proteins . 


BL00674B 4.46 1.000e- 
09 223-245 


255 


PD01796 


PROTEIN TRANSMEMBRANE 
COBALT ZINC CADMIU. 


PD01794 15.01 £.045e- 
09 61-88 


256 


BL50002 


Src homology 3 (SH3) 
domain proteins profile. 


BL50002B 15.18 2.800e- 
10 421-435 


258 


PR00094 


ADENYLATE KINASE 
SIGNATURE 


PR00094C 12.94 2.200e- 
18 87-104 PR00094D 
12.52 2.731e-14 161- 
177 PR00094A 10.31 
5.500e-14 11-25 
PR00094B 11.01 4.115e- 
13 39-54 PR00094E 
11.25 7.333e-13 178- 
193 


259 


BL00892 


HIT family proteins. 


BL00892A 18.17 5.500e- 
13 60-91 


262 


BL00388 


Proteasome A-type 
subunits proteins. 


BL00388A 23.14 1.000e- 
40 8-54 BL00388B 
31.38 3.864e-33 66-108 
BL00388D 20.71 1.000e- 
21 153-184 BL00388C 
18.79 8.147e-16 126- 
148 


264 


BL00903 


Cytidine and 
deoxycytidylate 
deaminases zinc-binding 
region s. 


BL00903 12.93 5.821e- 
09 91-101 


267 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107B 13.31 1.529e- 
09 241-257 


270 


BL00226 


Intermediate filaments 
proteins. 


BL00226D 19.10 1.000e- 
37 362-409 BL00226B 
23.86 8.043e-35 196- 
244 BL00226C 13.23 
7.000e-20 261-292 
BL00226A 12.77 6.143e- 
15 96-111 


271 


PD02952 


KINASE TRANSFERASE 
CHOLINE PROTEIN 
MULTIGENE FAMI . 


PD02952C 15.76 9.731e- 
16 235-265 PD02952B 
15.57 5.625e-09 215- 
229 


272 


PD02929 


ADHESION GLYCOPROTEIN 
PRECURSOR I. 


PD02929A 28.27 1.000e- 
40 106-160 PD02929B 
18.36 8.800e-17 179- 
199 


274 


BL01027 


Glycosyl hydrolases 
family 39 proteins. 


BL01027B 15.34 3.486e- 
09 213-250 


275 


PR00424 


ADENOSINE RECEPTOR 
SIGNATURE 


PR00424D 14.32 6.451e- 
11 39-59 


277 


BL00052 


Ribosomal protein S7 
proteins. 


BL00052A 27.85 6.000e- 
13 137-184 BL00052B 
15.17 5.143e-12 208- 
235 


279 


BL00790 


Receptor tyrosine kinase 
class V proteins. 


BL00790N 13.25 5.659e- 
13 267-294 


280 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319D 11.64 6.625e-
23 107-125 PR00319C
13.41 1.000e-21 89-105
PR00319A 15.27 8.364e- 
21 51-68 PR00319B 
11.47 8.200e-19 70-85 


281 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE 


PR00319D 11.64 6.625e- 
23 94-112 PR00319C 
13.41 1.000e-21 76-92 
PR00319A 15.27 8.364e- 
21 38-55 PR00319B 
11.47 8.200e-19 57-72 


287 


PF00929 


Exonuclease . 


PF00929D 16.17 7.366e- 
09 149-163 


291 


BL00326 


Tropomyosins proteins. 


BL00326A 14.01 2.360e- 
09 93-127 


292 


BL00326 


Tropomyosins proteins. 


BL00326A 14.01 2.360e- 
09 93-127 


294 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 8.714e- 
12 203-216 


295 


BL00028 


Zinc finger, C2H2 type, 
domain proteins . 


BL00028 16.07 5.500e- 
15 322-339 BL00028 
16.07 9.471e-14 433-
450 BL00028 16.07
4.600e-13 648-665
BL00028 16.07 5.500e-
13 760-777 BL00028
16.07 9.550e-13 788-
805 BL00028 16.07
3.348e-12 704-721
BL00028 16.07 6.478e-
12 461-478 BL00028
16.07 8.435e-12 844-
861 BL00028 16.07
1.692e-11 593-610
BL00028 16.07 2.038e-
11 211-228 BL00028
16.07 5.154e-11 732-
749 BL00028 16.07
5.846e-11 377-394
BL00028 16.07 6.885e-
11 816-833 BL00028
16.07 7.231e-11 676-
693 BL00028 16.07
9.654e-11 564-581
BL00028 16.07 4.086e-
09 517-534 BL00028
16.07 7.429e-09 489-
506


296 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 8.333e- 
16 111-136 BL00215A 
15.82 2.723e-11 10-35
BL00215B 10.44 9.526e-
11 152-165 BL00215B
10.44 7.375e-10 59-72
BL00215A 15.82 9.824e-
10 205-230 


302 


PF00953 


Glycosyl transferase.


PF00953C 19.70 8.773e- 
34 236-269 PF00953A 
19.68 5.000e-25 102- 
129 PF00953B 6.17 
1.000e-13 182-194 


304 


PF00152 


tRNA synthetases class 
II. 


PF00152D 21.30 8.364e- 
28 422-461 PF00152C 
28.03 9.250e-21 220- 
257 PF00152B 15.67 
2.658e-13 159-184 
PF00152A 19.68 5.714e- 
11 44-67 


305 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 8.250e- 
35 37-76 


306 


PD02784 


PROTEIN NUCLEAR 
RIBONUCLEOPROTEIN . 


PD02784B 26.46 5.840e- 
09 92-135 


307 


PR00454 


ETS DOMAIN SIGNATURE 


PR00454C 11.24 7.808e- 
09 1167-1186 


308 


PR00237 


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE


PR00237E 13.03 5.091e-
13 188-212 PR00237G
19.63 7.207e-13 268-
295 PR00237A 11.48
4.375e-11 24-49
PR00237C 15.69 3.057e- 
10 101-124 PR00237D 
8.94 4.750e-10 137-159 
PR00237F 13.57 5.364e- 
10 230-255 PR00237B 
13.50 9.438e-10 57-79 


309 


BL00522 


DNA polymerase family X 
proteins . 


BL00522C 11.90 7.577e- 
24 315-339 BL00522F 
14.90 1.310e-15 470-
494 BL00522A 25.52
1.265e-14 179-226
BL00522E 19.63 8.615e-
14 430-460 BL00522B
27.30 9.625e-12 267- 
313 


310 


BL00326 


Tropomyosins proteins. 


BL00326D 8.76 5.235e- 
10 856-897 


312 


BL00290 


Immunoglobulins and 
major histocompatibility 
complex proteins. 


BL00290A 20.89 4.706e-
14 151-174 BL00290B 
13.17 9.000e-12 211- 
229 


313 


BL00345 


Ets-domain proteins. 


BL00345B 21.28 1.000e-
40 34-85 BL00345A 
13.96 9.217e-16 1-20 


315 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 5.091e- 
15 63-76 


317 


BL01020 


SAR1 family proteins.


BL01020C 15.35 3.198e- 
17 79-130 


318 


BL00216 


Sugar transport 
proteins . 


BL00216B 27.64 4.696e- 
11 164-214 


320 


PR00109 


TYROSINE KINASE
CATALYTIC DOMAIN
SIGNATURE


PR00109B 12.27 4.814e-
10 216-235


321 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 5.688e- 
10 329-372 


322 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 8.765e- 
12 558-577 


324 


BL01241 


Link domain proteins. 


BL01241 35.81 8.313e- 
30 183-236 BL01241 
35.81 3.222e-13 282- 
335 


326 


BL00412 


Neuromodulin (GAP-43) 
proteins . 


BL00412D 16.54 4.000e- 
12 515-566 BL00412D 
16.54 5.705e-11 516-
567 BL00412D 16.54
7.848e-10 518-569
BL00412D 16.54 1.827e-
09 514-565 BL00412D
16.54 1.918e-09 513-
564 BL00412D 16.54 
2.102e-09 520-571 


328


BL00232 


Cadherins extracellular 
repeat proteins domain 
proteins . 


BL00232B 32.79 9.557e- 
20 151-199 BL00232B 
32.79 2.246e-18 41-89 
BL00232B 32.79 5.985e- 
18 370-418 BL00232B 
32.79 5.500e-16 258- 
306 BL00232B 32.79 
9.384e-15 475-523 
BL00232C 10.65 2.537e- 
12 256-274 BL00232C 
10.65 4.326e-11 368-
386 BL00232C 10.65
7.261e-11 473-491
BL00232C 10.65 7.457e- 
11 39-57 


330 


PR00454 


ETS DOMAIN SIGNATURE 


PR00454C 11.24 7.808e-
09 1167-1186 


331 


BL00598 


Chromo domain proteins. 


BL00598 14.45 8.393e- 
18 27-49 


333 


BL01016 


Glycoprotease family 
proteins. 


BL01016C 22.84 3.925e- 
32 70-115 BL01016E 
14.88 5.286e-19 149- 
177 BL01016H 13.71 
7.577e-13 291-301
BL01016D 8.86 3.298e-
11 127-140 BL01016G
7.14 5.622e-10 261-271
BL01016A 5.65 7.167e-
10 4-19 BL01016F
13.34 1.563e-09 200-
212 BL01016B 8.93 
8.855e-09 38-50 


339


BL01115 


GTP-binding nuclear
protein ran proteins. 


BL01115A 10.22 5.500e- 
11 17-61 


340 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 1.231e- 
33 10-49 


341 


BL01160


Kinesin light chain 
repeat proteins . 


BL01160B 19.54 5.042e- 
09 55-109 


342 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 2.400e- 
30 16-55 


343 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031A 16.80 1.000e-
40 20-68 


346 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 4.764e- 
11 135-154 


347 


PR00109 


TYROSINE KINASE
CATALYTIC DOMAIN
SIGNATURE


PR00109B 12.27 4.764e-
11 135-154


351 


BL01187 


Calcium-binding EGF-like 
domain proteins pattern 
proteins . 


BL01187B 12.04 1.783e- 
13 100-116 BL01187B 
12.04 8.435e-13 276- 
292 BL01187B 12.04
8.800e-11 13-29
BL01187B 12.04 7.429e-
10 54-70 BL01187B
12.04 5.725e-09 231-
247 BL01187A 9.98
7.000e-09 255-267 


352 


PD00078 


REPEAT PROTEIN ANK 
NUCLEAR ANKYR. 


PD00078B 13.14 5.950e- 
10 366-379 PD00078B
13.14 4.522e-09 168- 
181 


354 


BL00380 


Rhodanese proteins . 


BL00380F 9.7^ 6.694e-
11 542-553 


355 


PF00628 


PHD-finger.


PF00628 15.84 1.000e-
11 116-131 


356 


PR00587 


SOMATOSTATIN RECEPTOR 
TYPE 1 SIGNATURE 


PR00587A 8.06 9.700e- 
09 17-37 


359 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 4.4$2e- 
15 261-274 PD00066 
13.92 6.500e-13 233- 
246 PD00066 13.92 
4.300e-09 289-302


361 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin 
receptors . 


PF00791B 28.49 9.604e- 
13 54-109 PF00791B 
28.49 1.095e-12 21-76 
PF00791A 27.85 1.432e- 
09 71-126 PF00791B 
28.49 7.440e-09 184-
239 


362 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin
receptors . 


PF00791B 28.49 2.273e- 
11 279-334 


363 


PR00450


RECOVERIN FAMILY
SIGNATURE


PR00450C 12.22 5.080e-
10 73-95 PR00450C
12.22 3.278e-09 109-
131 


364 


PF00242 


DNA polymerase (viral) 
N-terminal domain
proteins . 


PF00242Q 13.51 2.328e- 
09 22-68 


365 


PF00242 


DNA polymerase (viral)
N-terminal domain
proteins . 


PF00242Q 13.51 2.328e- 
09 22-68 


366 


BL01160 


Kinesin light chain
repeat proteins. 


BL01160B 19.54 6.644e- 
09 1038-1092 


367 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019B 11.36 1.360e- 
09 229-243 PR00019B 
11.36 6.040e-09 91-105 
PR00019A 11.19 8.667e- 
09 370-384 


368 


PR00011 


TYPE III EGF-LIKE 
SIGNATURE 


PR00011D 14.03 9.000e-
15 30-49 PR00011A
14.06 9.830e-15 30-49
PR00011B 13.08 4.500e-
14 30-49 PR00011C
24.25 5.143e-09 6-35


369 


BL01032 


Protein phosphatase 2C 
proteins . 


BL01032H 11.25 4.150e- 
09 417-430 


372 


BL00478 


LIM domain proteins. 


BL00478B 14.79 7.750e- 
12 410-425 


373 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 9.757e- 
34 26-65 


376 


PR00170 


SODIUM CHANNEL SIGNATURE 


PR00170E 6.48 2.739e-
10 88-118


380 


BL00107 


Protein kinases ATP- 
binding region proteins . 


BL00107A 18.39 1.000e-
23 276-307 BL00107B 
13.31 1.692e-12 342- 
358 


381 


BL00455 


Putative AMP-binding 
domain proteins. 


BL00455 13.31 5.714e-
12 50-66 


382


PR00624


HISTONE H5 SIGNATURE


PR00624G 4.08 4.900e-
09 524-544 




PD00078


REPEAT PROTEIN ANK 
NUCLEAR ANKYR. 


PD00078B 13.14 5.950e-
10 366-379 PD00078B 
13.14 4.522e-09 168- 
181 


385 


PR00511 


TEKTIN SIGNATURE 


PR00511D 7.11 5.371e- 
09 67-80 


38S 


PD02870 


RECEPTOR INTERLEUKIN-1 
PRECURSOR . 


PD02870B 18.83 6.000e-
10 97-130 


388 


PD00066 


PROTEIN ZINC-FINGER 
METAL-BINDI.


PD00066 13.92 5.000e-
13 516-529 


389 


BL00290 


Immunoglobulins and 
major histocompatibility 
complex proteins. 


BL00290A 20.89 7.667e- 
09 151-174 


390 


BL00215 


Mitochondrial energy- 
transfer proteins. 


BL00215A 15.82 S.200e- 
15 221-246 BL00215A 
15.82 7.618e-14 20-45
BL00215A 15.82 8.851e-
11 123-148 BL00215B
10.44 9.526e-11 69-82
BL00215B 10.44 7.300e- 
09 272-285 BL00215B 
10.44 8.500e-09 165- 
178 


394 


BL00674 


AAA-protein family 
proteins. 


BL00674B 4.46 2.723e- 
16 299-321 


397 


PR00048


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 8.579e-
11 141-155 


398 


PR00761 


BINDIN PRECURSOR 
SIGNATURE 


PR00761B 9.93 6.764e- 
09 55-74 


399 


BL00240


Receptor tyrosine kinase 
class III proteins. 


BL00240B 24.70 7.907e- 
10 118-142 


401 


PF00676


Dehydrogenase E1
component.


PF00676B 24.71 8.071e- 
18 331-369 PF00676D 
14.40 3.854e-15 486- 
506 PF00676C 16.88 
9.182e-14 454-478


402 


BL00514


Fibrinogen beta and
gamma chains C-terminal
domain proteins. 


BL00514C 17.41 4.673e- 
28 4432-4469 BL00514G 
15.98 6.092e-14 4555- 
4585 BL00514D 15.35 
2.532e-12 4473-4486 
BL00514F 11.65 4.288e- 
10 4519-4534 BL00514H 
14.95 4.955e-10 4584- 
4609 


403 


PF00992 


Troponin . 


PF00992A 16.67 5.974e- 
09 105-140 


404 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019B 11.36 1.450e- 
10 73-87 PR00019A 
11.19 8.043e-10 76-90 
PR00019B 11.36 1.000e-
09 50-64 PR00019B 
11.36 1.000e-09 96-110 


405 


BL00232 


Cadherins extracellular 
repeat proteins domain 
proteins. 


BL00232B 32.79 9.557e- 
20 139-187 BL00232B 
32.79 2.246e-18 29-77 
BL00232B 32.79 5.985e- 
18 358-406 BL00232B 
32.79 5.500e-16 246-
294 BL00232B 32.79
9.384e-15 463-511
BL00232C 10.65 2.537e-
12 244-262 BL00232C
10.65 4.326e-11 356-
374 BL00232C 10.65
7.261e-11 461-479
BL00232C 10.65 7.457e- 
11 27-45 


407 


PF00426 


Outer Capsid protein VP4 
(Hemagglutinin) . 


PF00426S 15.67 5.634e- 
09 902-940 


409 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 9.695e- 
09 126-180 


410 


BL00741 


Guanine-nucleotide
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 2.731e-
09 252-275 


411 


PF00646 


F-box domain proteins. 


PF00646A 14.37 6.344e- 
09 86-100 


412 


BL00603 


Thymidine kinase 
cellular-type proteins. 


BL00603B 11.39 8.500e- 
09 542-557 


415 


BL00866 


Carbamoyl-phosphate
synthase subdomain 
proteins . 


BL00866B 36.29 3.571e- 
31 245-291 BL00866C 
23.26 9.000e-25 331- 
366 


418 


PR00239 


MOLLUSCAN RHODOPSIN C- 
TERMINAL TAIL SIGNATURE 


PR00239E 1.58 6.114e- 
09 590-602 


421 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin 
receptors . 


PF00791B 28.49 7.955e- 
14 23-78 PF00791B 
28.49 3.653e-12 273- 
328 PF00791B 28.49 
4.273e-11 156-211
PF00791B 28.49 7.818e- 
11 89-144 PF00791B 
28.49 1.524e-10 56-111 
PF00791C 20.98 3.559e- 
09 37-76 PF00791C 
20.98 5.235e-09 170- 
209 PF00791C 20.98 
5.235e-09 381-420 
PF00791B 28.49 6.202e-
09 189-244 PF00791B 
28.49 7.028e-09 435- 
490 PF00791B 28.49 
6.679e-09 367-422 


424 


DM00892 


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 7.207e- 
28 1645-1679 


425 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109D 17.04 5.881e- 
10 228-251 


429 


BL00518


Zinc finger, C3HC4 type
(RING finger), proteins.


BL00518 12.23 4.600e-
11 31-40 


431 


BL00039 


DEAD-box subfamily ATP-
dependent helicases 
proteins . 


BL00039D 21.67 1.844e- 
34 490-536 BL00039A 
18.44 5.615e-19 205- 
244 BL00039B 19.19 
8.920e-16 251-277 
BL00039C 15.63 5.781e- 
15 333-357 


432 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 7.652e- 
12 169-185 


433 


PR00828 


FORMIN SIGNATURE


PR00828B 5.23 8.218e- 
10 382-405 


436 


BL00415 


Synapsins proteins. 


BL00415N 4.29 8.643e- 
11 195-239 BL00415N 
4.29 3.036e-09 809-853 


443 


PR00834 


HTRA/DEGQ PROTEASE 
FAMILY SIGNATURE 


PR00834F 10.91 6.040e- 
11 221-234 


446 


PF01140 


Matrix protein (MA),
P15.


PF01140D 15.54 9.663e-
10 183-218 PF01140D
15.54 3.093e-09 246- 
281 


449 


PR00568 


DOPAMINE D3 RECEPTOR 
SIGNATURE 


PR00568G 13.95 5.551e- 
09 39-53 


451 


PF00084


Sushi domain proteins 
(SCR repeat proteins. 


PF00084B 9.45 3.813e- 
10 47-59 


452 


BL00790 


Receptor tyrosine kinase 
class V proteins. 


BL00790I 20.01 2.821e-
09 618-649 


456 


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 1.000e-
25 77-99 PR00380D
9.93 1.000e-21 281-303
PR00380C 13.18 8.286e-
17 230-249 PR00380B
12.64 4.724e-16 194-
212 


457 


PR00253 


GAMMA-AMINOBUTYRIC ACID 
(GABA) RECEPTOR 
SIGNATURE 


PR00253A 9.15 9.143e- 
24 246-267 PR00253B 
13.47 2.000e-23 272- 
294 PR00253C 13.85 
7.000e-23 306-328 
PR00253D 16.68 5.950e- 
21 452-473 


467 


PR00849


GLYCOSYL HYDROLASE 
FAMILY 58 SIGNATURE 


PR00849D 9.77 9.236e- 
09 910-937 


t 1 J. 


BL00678


Trp-Asp (WD) repeat
proteins proteins. 


BL00678 9.67 8.200e-12 
33-44 


472 


BL00226


Intermediate filaments 
proteins . 


BL00226B 23.86 3.721e- 
09 282-330 


473 


BL00344 


GATA-type zinc finger 
domain proteins. 


BL00344 17.99 7.000e- 
12 814-852 


474 


BL00481 


Thiol-activated 
cytolysins proteins. 


BL00481E 13.07 8.909e- 
09 173-199 


*i f J 


PR00319


BETA G- PROTEIN 
(TRANSDUCIN) SIGNATURE 


PR00319B 11.47 2.571e- 
09 393-408 


480 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 1.900e- 
38 8-47 


481 


PR00405


HIV REV INTERACTING
PROTEIN SIGNATURE


PR00405C 19.41 1.000e-
19 451-473 PR00405B 
11.83 4.333e-18 430- 
448 PR00405A 17.71 
4.971e-18 411-431 


482 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 9.286e- 
10 959-974 PR00049D 
0.00 9.857e-10 958-973 
PR00049D 0.00 1.305e- 
09 937-952 PR00049D 
0.00 8.322e-09 939-954 


486 


PR00007 


COMPLEMENT C1Q DOMAIN 
SIGNATURE 


PR00007B 14.16 8.615e-
23 653-673 PR00007A
19.33 6.192e-22 626-
653 PR00007C 15.60
5.846e-19 698-720
PR00007D 9.64 3.647e- 
13 732-743 


487


PD00567 


PROTEIN RNA-BINDING RNA 
REPEAT HYD. 


PD00567B 18.23 2.853e- 
09 200-214 


488 


PR00988 


URIDINE KINASE SIGNATURE


PR00988A 6.39 4.569e-
12 3-21 


489 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 4.882e- 
27 30-69 PD01066 
19.43 3.430e-10 71-110 


490 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 7.864e-
09 663-678 




BL01128


Shikimate kinase 
proteins . 


BL01128A 18.84 6.464e- 
17 58-92 


497


PF00429 


ENV polyprotein (coat
polyprotein).


PF00429 31.08 7.171e-
15 21-71


498 


BL00120 


Lipases, serine
proteins.


BL00120B 11.37 7.923e-
09 185-200 


500


BL00030


Eukaryotic RNA-binding
region RNP-1 proteins. 


BL00030A 14.39 7.353e- 
11 299-318 


501 


BL01159 


WW/rsp5/WWP domain
proteins.


BL01159 13.85 8.579e-
12 131-146


505 


BL00021 


Kringle domain proteins.


BL00021B 13.33 3.739e-
17 492-510 


508 


PR00120 


(PROTON PUMP) SIGNATURE 


PR00120C 9.90 5.800e-
19 705-722 


509 


DM01417 


o is,w J.1M1/UL. ArWLz 

MUSHROOM SPAC22G7.04. 


DM01417E 20.62 2.938e- 
16 362-395 DM01417D 
11.08 3.800e-13 322- 
338 


510 


PF00534 


Glycosyl transferases 
group 1.


PF00534B 14.47 6.625e-
09 346-370 


511 


PF00534 


Glycosyl transferases 
group 1.


PF00534B 14.47 6.625e-
09 293-317 


512 


PF00534 


Glycosyl transferases 
group 1.


PF00534B 14.47 6.625e-
09 366-390 


513 


PD01841 


PHOSPHORYLASE KINASE 
ALPHA MUSCL. 


PD01841A 21.71 1.000e-
40 110-160 PD01841B
14.35 1.000e-40 181-
222 PD01841D 17.87
1.000e-40 243-295
PD01841F 13.36 1.000e-
40 333-382 PD01841G
24.26 1.000e-40 386-
440 PD01841L 18.42
1.000e-40 968-1010
PD01841I 23.00 4.545e-
37 762-804 PD01841E
18.60 3.750e-36 295- 
333 PD01841J 14.94 
6.023e-35 851-888 
PD01841H 21.30 2.909e- 
33 490-527 PD01841K 
14.81 7.088e-33 924- 
954 PD01841C 13.78 
9.386e-23 222-243 
PD01841M 10.82 8.594e- 
21 1054-1073 PD01841I 
23.00 2.667e-13 549- 
591 


514 


PR00153


CYCLOPHILIN PEPTIDYL-
PROLYL CIS-TRANS
ISOMERASE SIGNATURE


PR00153C 11.01 7.188e- 
13 95-111 PR00153E 
9.10 4.150e-12 122-138 


515 


BL00740 


MAM domain proteins. 


BL00740A 13.87 7.188e-
12 410-423 


516 


DM00892 


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 6.087e- 
12 1018-1052 


517 


BL00242 


Integrins alpha chain 
proteins . 


BL00242C 16.86 8.320e-
09 12-42 


523 


DM00031


IMMUNOGLOBULIN V REGION.


DM00031A 16.80 3.750e- 
39 20-68 DM00031B 
15.41 1.000e-25 84-118 


525 


BL00319 


Amyloidogenic 
glycoprotein 
extracellular domain 
proteins . 


BL00319C 17.12 8.375e-
10 61-95 


526 


PF00789 


Domain present in 
ubiquitin-regulatory
proteins.


PF00789B 19.70 3.308e-
12 322-343 PF00789C 
20.98 5.269e-09 367- 
392 


528 


BL01162 


Quinone oxidoreductase / 
zeta-crystallin 
proteins . 


BL01162C 22.80 l.SOOe- 
16 120-164


529 


PR00910 


LUTEOVIRUS ORF6 PROTEIN 
SIGNATURE 


PR00910A 2.bl 3.893e- 
09 60-73 


532 


BL00215 


Mitochondrial energy- 
transfer proteins. 


BL00215A 15.82 4.000e-
17 11-36 BL00215A
15.82 8.660e-11 123-
148


533 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 4.000e-
17 11-36 BL00215A
15.82 8.660e-11 97-122


534 


BL00098 


Thiolases acyl-enzyme 
intermediate proteins. 


BL00098C 21.65 2.800e- 
38 181-227 BL00098B 
32.59 5.345e-38 86-141 
BL00098D 26.30 8.364e- 
35 245-288 BL00098E 
22.12 1.000e-34 314- 
352 BL00098F 10.18 
4.971e-22 365-386 
BL00098A 10.60 6.455e- 
11 38-50 


535 


PR00370 


FLAVIN-CONTAINING 
MONOOXYGENASE (FMO) 
SIGNATURE 


PR00370E 11.96 7.429e-
22 321-340 PR00370D 
16.33 6.143e-21 185- 
204 PR00370F 17.75 
6.559e-21 376-396 
PR00370B 10.91 9.591e- 
21 27-46 PR00370C 
12.72 3.500e-20 140- 
157 PR00370A 3.35 
6.442e-17 4-20 


536 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 7.429e- 
16 285-302 BL00028 
16.07 6.294e-14 341- 
358 BL00028 16.07 
1.346e-11 369-386
BL00028 16.07 1.692e-
11 397-414 BL00028
16.07 4.462e-11 453-
470 BL00028 16.07
7.231e-11 425-442
BL00028 16.07 4.300e-
10 313-330 


537 


BL00762 


WHEP-TRS domain 
proteins . 


BL00762A 23.43 9.419e- 
15 844-881 


538 


BL00762 


WHEP-TRS domain 
proteins . 


BL00762A 23.43 9.419e- 
15 819-856 


539


BL00762


WHEP-TRS domain 
proteins . 


BL00762A 23.43 9.419e- 
15 822-859 


540 


PR00985 


LEUCYL-TRNA SYNTHETASE 
SIGNATURE 


PR00985A 12.10 9.000e- 
10 357-375 


541 


PD02102 


SUBUNIT E V-ATPASE 
VACUOLAR ATP SYNTHASE 
HYDROL. 


PD02102A 16.74 1.000e-
40 3-47 PD02102B
18.28 4.375e-34 57-100
PD02102D 21.69 1.923e-
30 179-218 PD02102C
26.34 8.929e-26 100-
146 


543 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 1.000e-
10 48-65 BL00028 
16.07 6.400e-10 193- 
210 BL00028 16.07 
1.000e-09 343-360 
BL00028 16.07 6.914e- 
09 78-95 


545 


BL00250 


TGF-beta family 
proteins. 


BL00250A 21.24 8.000e- 
31 293-329 BL00250B 
27.37 5.286e-24 354- 
390 


547 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319B 11.47 2.714e-
09 186-201 PR00319A
15.27 7.344e-09 210-
227


548 


BL01204 


NF-kappa-B/Rel/dorsal
domain proteins.


BL01204A 17.74 1.000e-
40 8-56 BL01204D 
16.42 1.000e-40 177- 
221 BL01204E 13.83 
7.652e-30 225-250 
BL01204C 13.93 8.714e- 
22 141-160 BL01204B 
15.41 4.333e-16 102- 
116 


549 


PR00326 


GTP1/OBG GTP-BINDING 
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 8.364e- 
15 255-276 


551 


PF00632 


HECT-domain (ubiquitin- 
transferase) . 


PF00632C 20.66 3.302e- 
23 1569-1601 PF00632B 
18.45 3.700e-21 1515-
1543 


554


BL00290 


Immunoglobulins and 
major histocompatibility 
complex proteins. 


BL00290B 13.17 1.600e- 
14 187-205 BL00290A 
20.89 2.059e-14 130- 
153 


557 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 6.339e- 
09 846-879 


559 


DM01111 


4 kw PHOSPHATASE 
TRANSFORMING 61K PDF1 . 


DM01111L 11.93 3.762e- 
09 7-35 


562 


PF00658 


Poly-adenylate binding 
protein, unique domain 
proteins . 


PF00658C 16.33 9.455e- 
32 118-155 


564 


BL00141 


Eukaryotic and viral 
aspartyl proteases 
proteins . 


BL00141A 12.10 4.150e- 
10 472-488 


566 


PF00855


PWWP domain proteins. 


PF00855 13.75 5.667e- 
15 272-289 


567 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 4.977e-
13 229-268 


569 


BL00107


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 7.000e- 
19 118-149 BL00107B 
13.31 5.500e-15 183- 
199 


570 


BL00107 


Protein kinases ATP- 
binding region proteins . 


BL00107A 18.39 7.000e- 
19 118-149 BL00107B 
13.31 5.500e-15 183- 
199 


572 


PR00193 


MYOSIN HEAVY CHAIN 
SIGNATURE 


PR00193D 14.36 1.857e- 
34 454-483 PR00193C 
12.60 2.636e-31 223- 
251 PR00193B 11.69 
7.750e-29 171-197 
PR00193A 15.41 2.588e- 
22 115-135 PR00193E 
19.47 6.559e-19 508- 
537 


573 


PR00193 


MYOSIN HEAVY CHAIN 
SIGNATURE 


PR00193D 14.36 1.857e- 
34 470-499 PR00193C 
12.60 2.636e-31 239- 
267 PR00193B 11.69 
7.750e-29 171-197 
PR00193A 15.41 2.588e- 
22 115-135 PR00193E 
19.47 6.559e-19 524- 
553 


575 


BL00752 


XPA protein. 


BL00752B 19.17 9.703e-
10 885-929 


576 


BL00030 


Eukaryotic RNA-binding 
region RNP-1 proteins. 


BL00030A 14.39 7.000e-
09 275-295 


577 


BL00116


DNA polymerase family B
proteins.


BL00116A 12.81 5.737e-
13 864-877 BL00116B
11.82 1.529e-12 952- 
965 


578 


BL00195 


Glutaredoxin proteins. 


BL00195B 15.31 7.158e- 
09 121-141 


579 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019B 11.36 9.000e- 
11 217-231 PR00019B 
11.36 1.350e-09 386- 
400 PR00019A 11.19 
3.333e-09 389-403 
PR00019B 11.36 8.920e- 
09 363-377


580


PR00253 


GAMMA-AMINOBUTYRIC ACID
(GABA) RECEPTOR 
SIGNATURE 


PR00253A 9.15 2.125e- 
25 275-296 PR00253B 
13.47 7.923e-24 301- 
323 PR00253D 16.68 
5.846e-23 444-465 
PR00253C 13.85 2.241e- 
20 335-357 




PR00343 


SELECTIN SUPERFAMILY 
COMPLEMENT-BINDING
REPEAT SIGNATURE


PR00343C 16.85 2.286e-
11 1233-1252 PR00343C
16.85 5.500e-11 333-
352 PR00343C 16.85
5.500e-11 783-802
PR00343C 16.85 4.246e-
10 1491-1510 PR00343C 
16.85 8.230e-10 1686- 
1705 


584 


DM01537 


kw SKI2W SKI2 NUCLEOLAR 
HELICASE. 


DM01537B 21.63 1.878e- 
37 79-126 DM01537B 
21.63 9.491e-30 916- 
963 DM01537A 15.14 
3.186e-11 784-804


586 


PF00013 


KH domain proteins 
family of RNA binding 
proteins . 


PF00013 5.78 1.450e-09 
124-136 


587


DM00892 


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 4.409e- 
13 262-296 


589 


BL00478 


LIM domain proteins. 


BL00478B 14.79 1.643e- 
13 261-276 BL00478B 
14.79 7.709e-09 321- 
336 


590 


PF00855 


PWWP domain proteins. 


PF00855 13.75 8.000e-
15 931-948


591


PF00855 


PWWP domain proteins . 


PF00855 13.75 S.OOOe- 
15 1062-1079 


593 


PF00628 


PHD-finger.


PF00628 15.84 3.455e- 
12 424-439 


594 


PR00205


CADHERIN SIGNATURE


PR00205B 11.39 2.241e-
16 558-576 PR00205A 
14.73 9.308e-13 542- 
558 PR00205C 13.65 
5.304e-12 594-609 
PR00205B 11.39 4.273e- 
10 336-354 


596 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 4.789e- 
18 307-338 


598 


PD01675 


GLYCOPROTEIN MAJOR 
ENVELOPE PROBABLE U3.


PD01675C 19.89 2.330e- 
10 55-89 


600 


BL00242 


Integrins alpha chain 
proteins . 


BL00242E 9.03 9.591e- 
27 985-1014 BL00242C 
16.86 4.115e-26 286-
316 BL00242D 13.57 
4.150e-25 357-382 
BL00242B 8.13 7.353e- 
12 189-199 BL00242D 
13.57 3.455e-11 421-
446 BL00242A 13.80
5.000e-11 61-73
BL00242D 13.57 4.986e- 
10 291-316 


601 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE


PR00320A 16.74 5.610e-
09 198-213 


602 


PR00278 


PANCREATIC HORMONE 
SIGNATURE 


PR00278A 12.43 4.569e- 
10 331-348 


603 


BL00479 


Phorbol esters / 
diacylglycerol binding 
domain proteins . 


BL00479C 12.01 3.250e- 
12 170-183 


604 


BL00315


Dehydrins proteins . 


BL00315A 9.35 1.672e- 
09 424-452 


605 


BL00415 


Synapsins proteins. 


BL00415N 4.29 9.794e- 
10 295-339 


606 


PR00926 


MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE 


PR00926F 17.75 1.000e-
13 335-358


608 


PF00855 


PWWP domain proteins . 


PF00855 13.75 5.167e- 
15 265-282 


609 


PF00855 


PWWP domain proteins. 


PF00855 13.75 5.167e- 
15 211-228 


612 


DM01206 


CORONAVIRUS NUCLEOCAPSID
PROTEIN . 


DM01206B 10.69 7.411e- 
10 877-897 DM01206B 
10.69 8.027e-10 861- 
881 DM01206B 10.69 
9.137e-10 873-893 
DM01206B 10.69 1.456e-
09 859-879 DM01206B 
10.69 1.797e-09 879- 
899 DM01206B 10.69 
4.076e-09 865-885 
DM01206B 10.69 7.038e- 
09 898-918 DM01206B 
10.69 7.949e-09 871- 
891 DM01206B 10.69 
8.291e-09 767-787 


615 


PD02699 


PROTEIN DNA-BINDING
BINDING DNA.


PD02699A 8.91 2.023e-
28 129-158 PD02699C 
24.84 1.000e-27 317- 
364 PD02699B 18.28 
1.000e-17 158-182 


616 


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 4.086e- 
22 288-310 PR00380D 
9.93 3.721e-17 486-508 
PR00380B 12.64 2.241e- 
16 410-428 PR00380C 
13.18 2.976e-13 436- 
455


617 


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 4.086e- 
22 288-310 PR00380D 
9.93 3.721e-17 486-508 
PR00380B 12.64 2.241e- 
16 410-428 PR00380C 
13.18 2.976e-13 436- 
455 


618 


DM01206 


CORONAVIRUS NUCLEOCAPSID 
PROTEIN . 


DM01206B 10.69 5.143e-
12 531-551 DM01206B 
10.69 2.603e-10 535- 
555 


621 


PR00700 


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700B 16.80 3.160e- 
21 561-582 


622 


BL00239 


Receptor tyrosine kinase
class II proteins. 


BL00239F 28.15 3.222e- 
10 647-692 BL00239C 
18.75 8.304e-10 543- 
566 


623 


PR00407 


EUKARYOTIC MOLYBDOPTERIN
DOMAIN SIGNATURE 


PR00407K 9.94 8.448e- 
09 326-339 


624 


BL00641 


Respiratory-chain NADH
dehydrogenase 75 Kd
subunit proteins.


BL00641C 21.10 1.000e-
40 157-202 BL00641E
24.37 1.000e-40 255-
308 BL00641F 33.12 
1.000e-40 571-623 
BL00641A 17.15 1.818e- 
37 48-80 BL00641B 
12.62 5.846e-34 113- 
139 BL00641D 13.23 
9.308e-29 216-240 


627 


PR00103 


CAMP-DEPENDENT PROTEIN
KINASE SIGNATURE 


PR00103E 17.80 2.500e- 
18 367-380 PR00103B 
13.39 2.080e-14 297- 
312 PR00103A 9.59 
2.957e-14 282-297 
PR00103D 10.83 3.077e-
12 346-358 PR00103C
15.68 1.000e-11 334-
344 PR00103B 13.39
1.450e-11 175-190
PR00103A 9.59 1.720e- 
10 160-175 


630 


PR00081 


GLUCOSE/RIBITOL 
DEHYDROGENASE FAMILY 
SIGNATURE 


PR00081A 10.53 6.211e-
16 4-22 


631 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 8.500e- 
14 37-50 


632 


DM01206 


CORONAVIRUS NUCLEOCAPSID 
PROTEIN. 


DM01206B 10.69 2.233e-
10 1324-1344 DM01206B 
10.69 4.822e-10 1276- 
1296 DM01206B 10.69 
7.658e-10 1328-1348 
DM01206B 10.69 8.274e- 
10 1280-1300 DM01206B 
10.69 4.532e-09 1320- 
1340 DM01206B 10.69 
7.266e-09 1326-1346 


bib 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 7.600e- 
23 145-176 BL00107B 
13.31 2.636e-13 211- 
227 


636 


BL00657 


Fork head domain 
proteins . 


BL00657A 19.39 1.545e- 
30 101-143 BL00657B 
22.27 7.750e-26 149- 
192 


637 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107B 13.31 1.000e-
10 607-623 


643 


BL00018 


EF-hand calcium-binding 
domain proteins. 


BL00018 7.41 4.913e-09 
199-212 


647 


PF00628 


PHD- finger. 


PF00628 15.84 2.350e- 
13 385-400 PF00628 
15.84 3.455e-12 464- 
479 


648 


BL01129 


Hypothetical 
yabO/yceC/sfhB family
proteins.


BL01129E 13.25 4.000e-
25 332-357 BL01129C 
25.56 8.200e-23 236- 
279 BL01129B 12.51 
6.118e-13 191-212 


649 


BL01228 


Hypothetical cof family 
proteins . 


BL01228D 17.44 3.908e- 
10 455-480 


650 


BL00027


'Homeobox' domain
proteins.


BL00027 26.43 6.684e- 
13 771-814 


651 


BL50002 


Src homology 3 (SH3) 
domain proteins profile. 


BL50002A 14.19 1.750e- 
12 1026-1045 


653 


PR00253 


GAMMA-AMINOBUTYRIC ACID 
(GABA) RECEPTOR
SIGNATURE 


PR00253A 9.15 4.000e- 
24 253-274 PR00253C 
13.85 8.800e-24 313- 
335 PR00253B 13.47 
3.143e-22 279-301 
PR00253D 16.68 7.652e-
20 422-443


654 


PD01719 


PRECURSOR GLYCOPROTEIN 
SIGNAL RE. 


PD01719A 12.89 4.452e- 
11 969-997 PD01719A 
12.89 3.961e-10 128- 
156 PD01719A 12.89 
7.395e-10 1276-1304 
PD01719A 12.89 1.222e-
09 1220-1248 


657 


BL00354 


HMG-I and HMG-Y DNA- 
binding domain proteins 
(Ahook) . 


BL00354C 6.61 8.397e- 
09 563-578 


658 


BL00354 


HMG-I and HMG-Y DNA- 
binding domain proteins
(Ahook) . 


BL00354C 6.61 8.397e- 
09 580-595 


659 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 2. l?4e- 
13 539-572 DM00215 
19.43 4.750e-12 549-
582 DM00215 19.43
9.824e-11 551-584
DM00215 19.43 2.929e-
10 548-581 DM00215
19.43 4.054e-10 550-
583 DM00215 19.43
5.339e-10 552-585 
DM00215 19.43 7.107e- 
10 544-577 


660 


PR00688 


XYLOSE ISOMERASE 
SIGNATURE 


PR00688I 13.78 9.518e- 
09 224-236 


661 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 5.950e- 
23 249-292 


662 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360B 13.61 7.158e- 
10 596-610 


663 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360B 13.61 7.158e- 
10 596-610 


664 


PR00360 


C2 DOMAIN SIGNATURE 


PR00360B 13.61 7.158e- 
10 596-610 


666


PR00819 


CBXX/CFQX SUPERFAMILY 
SIGNATURE 


PR00819B 10.83 8.988e- 
10 704-720 


667 


BL50040 


Elongation factor 1 
gamma chain profile. 


BL50040C 22.62 2.143e-
16 135-178 


668 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019B 11.36 1.360e- 
09 139-153 PR00019A 
11.19 1.667e-09 94-108 
PR00019B 11.36 4.600e- 
09 163-177 


670 


BL00018 


EF-hand calcium-binding 
domain proteins. 


BL00018 7.41 3.250e-10 
681-694 BL00018 7.41 
6.400e-10 717-730 


672 


PD00131 


ATP-BINDING TRANSPORT 
TRANSMEMBR.


PD00131B 34.97 1.000e-
34 356-410 PD00131C 
19.59 1.346e-26 504- 
542 


673 


PR00667 


RETINAL PIGMENT 
EPITHELIUM-RETINAL GPCR 
SIGNATURE 


PR00667G 15.33 7.$57e- 
10 106-123 


674 


PR00320 


G-PROTEIN BETA WD-40 
REPEAT SIGNATURE 


PR00320A 16.74 4.857e- 
13 593-608 PR00320B 
12.19 4.115e-12 635- 
650 PR00320C 13.01 
8.435e-11 717-732
PR00320C 13.01 2.800e- 
10 635-650 PR00320C 
13.01 6.400e-10 593- 
608 PR00320B 12.19 
3.250e-09 593-608


675


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320A 16.74 4.857e- 
13 572-587 PR00320B 
12.19 4.115e-12 614- 
629 PR00320C 13.01 
8.435e-11 696-711
PR00320C 13.01 2.800e- 
10 614-629 PR00320C 
13.01 6.400e-10 572- 
587 PR00320B 12.19 
3.250e-09 572-587 


676


PR00019


LEUCINE-RICH REPEAT
SIGNATURE


PR00019A 11.19 9.667e- 
09 249-263 


679


PF00642


Zinc finger C-x8-C-x5-C-
x3-H type (and similar).


PF00642 11.59 3.700e-
16 225-236 PF00642 
11.59 7.900e-12 187- 
198 


680 


PR00308


TYPE I ANTIFREEZE 
PROTEIN SIGNATURE 


PR00308C 3.83 8.754e- 
10 286-296 


681 


BL00019 


Actinin-type actin- 
binding domain proteins. 


BL00019D 15.33 4.200e-
19 227-257 


682 


PR00700 


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700D 12.47 4.000e- 
09 99-118 


687 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 8.500e- 
10 538-553 


689 


BL01024 


Protein phosphatase 2A 
regulatory subunit PR55 
proteins . 


BL01024A 10.26 1.000e-
40 22-69 BL01024B
8.91 1.000e-40 86-127
BL01024C 7.80 1.000e-
40 146-185 BL01024D
13.22 1.000e-40 185-
222 BL01024E 11.96
1.000e-40 222-266
BL01024F 9.42 1.000e-
40 266-317 BL01024G
11.09 1.000e-40 317-
349 BL01024H 13.88
1.000e-40 389-442


691 


BL00027 


'Homeobox' domain 
proteins . 


BL00027 26.43 8.071e- 
31 152-195 


692 


BL00211 


ABC transporters family 
proteins . 


BL00211A 12.23 5.050e- 
09 45-57 


693


BL00211 


ABC transporters family 
proteins . 


BL00211A 12.23 5.050e- 
09 45-57 


694 


BL00211 


ABC transporters family 
proteins . 


BL00211A 12.23 5.050e- 
09 58-70 


696 


BL00680 


Methionine 

aminopeptidase subfamily 
1 proteins. 


BL00680 14.37 5.304e-
17 173-195 


697 


BL00741


Guanine-nucleotide
dissociation stimulators
CDC24 family sign.


BL00741B 14.27 3.418e- 
11 242-265 


698 


DM01930 


2 kw FINGER SMCX SMCY 
YDR096W. 


DM01930E 15.41 1.367e- 
37 170-215 DM01930F 
14.16 8.232e-28 267-
303 DM01930B 19.86 
9.163e-10 37-71 


700 


PR00869 


DNA-POLYMERASE FAMILY X
SIGNATURE 


PR00869A 12.80 1.281e- 
16 245-263 


701 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 2.174e- 
10 77-91 PR00048A 
10.52 6.870e-10 133- 
147 PR00048A 10.52 
8.826e-10 105-119 
PR00048A 10.52 5.320e- 
09 161-175 


702 


BL00523 


Sulfatases proteins. 


BL00523E 19.27 2.565e-
25 326-356 BL00523A 
13.36 5.050e-16 38-55 
BL00523B 8.64 5.909e- 
15 86-98 BL00523C 
12.64 5.500e-13 137- 
148 BL00523D 9.89 
1.844e-11 290-302
BL00523G 9.46 5.500e- 
10 513-523 BL00523F 
10.85 6.351e-09 413-
424 


703 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 8.412e- 
12 376-390 PR00048B
6.02 1.000e-10 334-344 
PR00048B 6.02 1.474e- 
09 364-374 


707 


PD00787 


SYNTHASE BIOSYNTHESIS 
TRANSFERASE. 


PD00787A 14.84 8.941e- 
14 66-82 


708 


PR00761 


BINDIN PRECURSOR 
SIGNATURE 


PR00761E 14.32 8.500e- 
10 822-841 


712 


DM01354 


kw TRANSCRIPTASE REVERSE 
II ORF2. 


DM01354Y 10.69 4.977e-
38 425-465 DM01354X 
13.86 7.300e-34 376- 
415 DM01354V 12.97 
4.923e-17 311-358 
DM01354W 12.64 5.596e- 
10 356-376 


713 


BL00039 


DEAD-box subfamily ATP-
dependent helicases 
proteins. 


BL00039D 21.67 7.545e- 
27 450-496 BL00039A 
18.44 2.537e-18 147- 
186 BL00039C 15.63 
2.216e-14 280-304 
BL00039B 19.19 1.947e- 
13 194-220 


715 


BL00383 


Tyrosine specific 
protein phosphatases 
proteins . 


BL00383E 10.35 4.981e- 
10 150-161 


717


PF00777


Sialyl transferase 
family. 


PF00777C 18.60 4.035e- 
21 106-161 


718 


DM00031 


IMMUNOGLOBULIN V REGION.


DM00031A 16.80 3.750e- 
39 20-68 DM00031B 
15.41 2.688e-28 84-118
DM00031C 12.79 1.300e- 
12 131-142 


719 


BL00243 


Integrins beta chain 
cysteine-rich domain 
proteins. 


BL00243B 17.54 1.000e-
40 131-172 BL00243C
16.42 1.000e-40 172-
208 BL00243D 24.07
1.000e-40 222-274
BL00243F 22.63 1.000e-
40 314-358 BL00243I
31.77 6.571e-39 607-
650 BL00243E 16.70
3.077e-35 274-304
BL00243G 21.38 3.625e-
34 358-400 BL00243H
17.53 5.235e-29 567-
593 BL00243A 17.61
3.250e-21 63-84
BL00243H 17.53 7.167e-
16 477-503 BL00243H
17.53 2.304e-11 524-
550 BL00243H 17.53
5.304e-11 606-632
BL00243I 31.77 1.380e-
09 610-653


720 


PR00217 


43 KD POSTSYNAPTIC
PROTEIN SIGNATURE 


PR00217C 10.91 8.022e- 
09 20-36 


722 


PR00704 


CALPAIN CYSTEINE 
PROTEASE (C2) FAMILY 
SIGNATURE 


PR00704D 11.05 5.909e- 
34 135-161 PR00704F 
13.61 7.000e-26 190- 
218 PR00704E 12.55 
8.071e-26 165-189 
PR00704B 17.94 2.241e- 
23 75-98 PR00704A 
14.68 4.094e-19 30-54 
PR00704C 11.88 1.871e-
18 99-116 


725 


PR00194 


TROPOMYOSIN SIGNATURE 


PR00194A 7.86 7.652e- 
09 169-187 


726 


PR00194 


TROPOMYOSIN SIGNATURE 


PR00194A 7.86 7.652e- 
09 169-187 


727 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 2.125e-
13 277-292 PR00320A
16.74 1.310e-11 277-
292 PR00320C 13.01
4.522e-11 323-338
PR00320A 16.74 6.586e-
11 323-338 PR00320B
12.19 4.343e-10 323-
338 PR00320B 12.19
6.914e-10 277-292


731 


PR00195 


DYNAMIN SIGNATURE 


PR00195A 11.94 8.627e- 
16 288-307 PR00195E 
9.82 3.912e-11 457-474


733 


PF00642 


Zinc finger C-x8-C-x5-C- 
x3-H type (and similar) . 


PF00642 11.59 9.082e- 
10 787-798 


738


BL00039 


DEAD-box subfamily ATP- 
dependent helicases 
proteins. 


BL00039A 18.44 2.565e-
28 26-65 BL00039D
21.67 2.105e-20 338-
384 BL00039C 15.63 
9.100e-13 160-184 
BL00039B 19.19 9.617e- 
11 73-99 


739 


BL01289 


TSC-22 / dip / bun
family proteins. 


BL01289A 12.18 8.909e- 
31 326-353 BL01289B 
10.45 9.571e-17 353- 
383


742 


BL01019 


ADP-ribosylation factors 
family proteins. 


BL01019A 13.20 7.078e- 
12 41-81 


743 


BL00965 


Phosphomannose isomerase 
type I proteins. 


BL00965C 23.78 1.000e-
40 256-305 BL00965B 
17.77 1.600e-25 126- 
153 BL00965A 10.57 
6.400e-19 94-113 


747 


BL00021 


Kringle domain proteins. 


BL00021D 24.56 4.563e- 
25 231-273 BL00021B 
13.33 5.345e-21 60-78 


*7 A Q 

/So 


BL00612


Osteonectin domain 
proteins . 


BL00612B 11.35 2.034e- 
11 93-126 


1 A Q 


PR00450


RECOVERIN FAMILY 
SIGNATURE 


PR00450C 12.22 6.880e- 
10 135-157 






BL00795


Involucrin proteins.


BL00795C 17.06 6.000e- 
11 384-429 BL00795C 
17.06 9.444e-11 370-
415 


754 


BL00051


Ribosomal protein L39e
proteins. 


BL00051 20.92 1.935e-
16 4-50 


755 


DM01970 


0 kw ZK632.12 YDR313C 
ENDOSOMAL III. 


DM01970B 8.60 7.723e- 
09 171-184 


760 


BL01020 


SAR1 family proteins.


BL01020C 15.35 9.020e- 
12 99-150 


762 


BL00046 


Histone H2A proteins . 


BL00046 12.95 1.000e-
40 33-88 


763


PD02411 


PROTEIN TRANSCRIPTION 
REGULATION NUCLEAR. 


PD02411 21.89 9.137e- 
10 206-240 


764 


BL00027 


'Homeobox' domain
proteins. 


BL00027 26.43 8.800e- 
29 417-460 


767 


BL01208 


VWFC domain proteins . 


BL01208B 15.83 6.063e- 
10 309-324 BL01208B 
15.83 8.031e-10 165- 
180 BL01208B 15.83 
4.162e-09 85-100


770 


BL00031 


Nuclear hormones 
receptors DNA-binding
region proteins . 


BL00031A 19.55 9.571e- 
32 208-241 BL00031B
22.25 5.500e-27 242- 
274 


772 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 1.450e- 
18 4-26 PR00449E 
13.50 3.520e-14 142- 
165 PR00449C 17.27 
3.032e-13 44-67 
PR00449D 10.79 8.579e- 
13 107-121 PR00449B 
14.34 3.455e-11 27-44


773


BL00523


Sulfatases proteins.


BL00523E 19.27 9.333e- 
23 299-329 BL00523A 
13.36 2.200e-13 47-64 
BL00523B 8.64 2.607e- 
13 91-103 BL00523D 
9.89 7.923e-12 224-236 
BL00523C 12.64 4.512e- 
10 141-152 BL00523F 
10.85 5.821e-10 373- 
384 


775 


BL00028


Zinc finger, C2H2 type,
domain proteins.


BL00028 16.07 7.686e-
09 568-585 


776 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 7.686e- 
09 621-638 


777 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 7.686e- 
09 595-612 


778 


BL00030 


Eukaryotic RNA-binding
region RNP-1 proteins. 


BL00030A 14.39 8.412e- 
11 322-341 BL00030A 
14.39 7.000e-10 220-
239 


779 


PR00079 


GLUCOSE-6-PHOSPHATE
DEHYDROGENASE SIGNATURE 


PR00079B 12.98 2.929e- 
26 193-222 PR00079E 
16.65 4.150e-23 348- 
375 PR00079C 8.68 
6.351e-16 246-264 
PR00079D 13.51 7.070e- 
16 264-281 PR00079A 
16.12 6.769e-13 169- 
183 


781 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 9.250e- 
17 10-35 BL00215A 
15.82 6.000e-16 221- 
246 BL00215A 15.82 
7.857e-12 108-133 
BL00215B 10.44 9.526e- 
11 168-181 


783 


PD00289


PROTEIN SH3 DOMAIN 
REPEAT PRESYNA. 


PD00289 9.97 6.276e-09 
159-173 


785 


BL00690 


DEAH-box subfamily ATP- 
dependent helicases 
proteins . 


BL00690B 13.38 1.000e-
12 147-165 BL00690A
6.87 5.320e-10 114-124
BL00690C 7.51 3.189e-
09 218-228 


786 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449C 17.27 8.500e- 

13.20 5.235e-14 8-30 
PR00449E 13.50 2.853e- 
11 150-173 PR00449D 
10.79 1.545e-09 111- 
125 


788


DM01206 


CORONAVIRUS NUCLEOCAPSID 
PROTEIN. 


DM01206B 10.69 8.7^7e- 
10 1-21 


790 


BL00915 


Phosphatidylinositol 3-
and 4-kinases proteins.


BL00915C 22.43 9.182e- 
39 725-764 BL00915B 
22.78 5.050e-33 633- 
671 BL00915D 27.02 
1.529e-21 795-831 
BL00915A 10.09 1.000e-
13 395-407 


791 


PR00208 


GLIADIN AND LMW GLUTENIN 
SUPERFAMILY SIGNATURE 


PR00208A 12.59 6.294e- 
10 120-138 PR00208A 
12.59 6.294e-10 121- 
139 PR00208A 12.59 
6.294e-10 122-140 
PR00208A 12.59 6.294e- 
10 123-141 PR00208A 
12.59 6.294e-10 124- 
142 PR00208A 12.59 
6.294e-10 125-143 
PR00208A 12.59 6.294e- 
10 126-144 PR00208A 
12.59 6.294e-10 127- 
145 PR00208A 12.59 
6.294e-10 128-146 
PR00208A 12.59 6.294e- 
10 129-147 PR00208A 
12.59 7.411e-09 130- 
148 PR00208A 12.59 
7.658e-09 131-149 
PR00208A 12.59 7.904e- 
09 132-150 PR00208A 
12.59 8.274e-09 118- 
136 PR00208A 12.59 
8.274e-09 119-137 


795 


PR00205 


CADHERIN SIGNATURE 


PR00205B 11.39 5.034e- 
16 302-320 PR00205A 
14.73 1.257e-11 284-
300 PR00205C 13.65
1.333e-11 337-352


796 


BL00412 


Neuromodulin (GAP-43)
proteins . 


BL00412D 16.54 4.000e- 
12 196-247 BL00412D 
16.54 5.705e-11 197-
248 BL00412D 16.54 
7.848e-10 199-250 
BL00412D 16.54 1.827e- 
09 195-246 BL00412D 
16.54 1.918e-09 194- 
245 BL00412D 16.54 
2.102e-09 201-252 


797 


BL00021 


Kringle domain proteins. 


BL00021B 13.33 6.339e- 
13 40-58 


799 


BL01052 


Calponin family repeat 
proteins . 


BL01052C 18.51 1.000e-
40 87-127 BL01052A 
16.12 1.529e-32 3-35 
BL01052B 15.31 1.257e- 
25 52-78 BL01052D 
10.26 5.737e-25 174- 
194 


800 


BL00348 


p53 tumor antigen 
proteins . 


BL00348F 23.19 3.714e- 
09 197-240 


801 


BL00309 


Vertebrate galactoside- 
binding lectin proteins . 


BL00309C 18.65 1.621e-
09 62-87 


802 


PR00245 


OLFACTORY RECEPTOR 
SIGNATURE 


PR00245D 10.47 5.224e- 
09 187-199 


804 


PF00774 


Dihydropyridine 
sensitive L-type calcium 
channel (Beta subuni. 


PF00774A 16.47 8.457e- 
10 110-156 


808 


PR00667 


RETINAL PIGMENT 
EPITHELIUM-RETINAL GPCR
SIGNATURE 


PR00667C 11.71 9.875e- 
09 12-28 


810 


PD02346 


PHOTOSYSTEM II PROTEIN
PRECURSOR 


PD02346F 12.89 4.340e- 
09 317-354 






WO 01/53312 PCT/US00/34263 



PHOTOSYNTHESIS . 




811 


BL00685 


CBF-A/NF-YB subunit 
proteins . 


BL00685B 14.41 6.779e-
14 54-95 BL00685A
11.22 4.798e-13 5-54


812 


PR00080 


ALCOHOL DEHYDROGENASE 
SUPERFAMILY SIGNATURE


PR00080A 9.32 9.419e- 
10 93-105 


813 


BL00357 


Histone H2B proteins. 


BL00357 7.74 1.988e-17 
22-65 


815 


PD00066 


PROTEIN ZINC-FINGER 
METAL-BINDI.


PD00066 13.92 7.923e- 
15 158-171 PD00066 
13.92 5.200e-14 46-59 
PD00066 13.92 7.000e- 
14 18-31 PD00066 
13.92 7.000e-13 130-
143 PD00066 13.92
7.500e-13 214-227
PD00066 13.92 9.000e-
13 102-115 PD00066
13.92 4.429e-12 186-
199 PD00066 13.92
1.783e-11 74-87


816 


BL01195 


Peptidyl-tRNA hydrolase 
proteins. 


BL01195C 20.12 3.348e- 
20 100-139 


820


BL00520


Interleukin-10 family
proteins. 


BL00520A 6.21 6.471e- 
09 1-14 


822 


BL00972 


Ubiquitin carboxyl- 
terminal hydrolases 
family 2 proteins. 


BL00972A 11.93 8.113e- 
09 224-242 


825 


PR00876


NEMATODE METALLOTHIONEIN 
SIGNATURE 


PR00876B 7.66 2.268e- 
10 101-115 


829


PD02855 


FLAVOPROTEIN PROTEIN
DNA/PANTOTHEN. 


PD02855A 18.37 4.732e- 
28 88-124 PD02855B 
8.36 6.478e-09 132-142 


830


PR00405


HIV REV INTERACTING 
PROTEIN SIGNATURE 


PR00405B 11.83 7.000e- 
21 44-62 PR00405C 
19.41 1.000e-13 65-87 
PR00405A 17.71 7.283e- 
13 25-45 


831 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE 


PR00019A 11.19 1.000e-
09 47-61 PR00019B
11.36 1.720e-09 136- 
150 PR00019B 11.36 
3.880e-09 44-58 


832 


PR00011 


TYPE III EGF-LIKE 
SIGNATURE


PR00011B 13.08 3.438e-
16 164-183 PR00011D
14.03 6.850e-16 164-
183 PR00011A 14.06
8.364e-14 164-183
PR00011C 24.25 5.415e-
12 231-260 PR00011D
14.03 9.852e-11 212-
231 


834


PD00306


PROTEIN GLYCOPROTEIN
PRECURSOR RE.


PD00306A 10.26 7.000e- 
12 232-246 


835 


PD00306 


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306A 10.26 4.000e- 
10 290-304 


836 


PD00306 


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306A 10.26 7.000e- 
12 216-230 


837 


DM00215


PROLINE-RICH PROTEIN 3.


DM00215 19.43 3.898e-
09 78-111 


839 


PD02784 


PROTEIN NUCLEAR 
RIBONUCLEOPROTEIN.


PD02784B 26.46 8.302e- 
09 73-116 


840 


PR00700 


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700B 16.80 5.091e-
22 369-390 PR00700D
12.47 5.765e-21 491-
510 PR00700C 13.17 
4.750e-14 449-467 
PR00700F 11.18 8.500e- 
11 538-549 PR00700E 
17.57 3.100e-10 522- 
538 


841 


PR00109 


TYROSINE KINASE 
CATALYTIC DOMAIN 
SIGNATURE 


PR00109B 12.27 5.404e- 
13 134-153 


844 


PD02785 


PROTEIN RIBOSOMAL 60S 
L22 RNA-BINDING HEP.


PD02785B 14.43 1.000e-
40 53-112 PD02785A 
15.23 1.915e-28 8-57 


845 


BL00826 


MARCKS family proteins. 


BL00826C 7. £3 6.738e- 
09 203-230 


846 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins.


BL00518 12.23 4.429e- 
10 15-24 


849 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 1.000e-
08 340-349 


850 


PR00308 


TYPE I ANTIFREEZE 
PROTEIN SIGNATURE 


PR00308A 5.90 6.506e- 
09 12-27 


851 


PD02411 


PROTEIN TRANSCRIPTION 
REGULATION NUCLEAR. 


PD02411 21.89 7.000e- 
16 246-280 


852 


BL00420 


Speract receptor repeat 
proteins domain 
proteins . 


BL00420B 22.67 1.000e-
40 723-778 BL00420B 
22.67 1.321e-38 933- 
988 BL00420B 22.67 
8.457e-28 482-537 
BL00420B 22.67 4.500e- 
27 587-642 BL00420B 
22.67 9.625e-27 270- 
325 BL00420B 22.67 
4.205e-26 163-218 
BL00420B 22.67 5.731e-
23 55-110 BL00420B 
22.67 6.464e-20 377- 
432 BL00420B 22.67 
2.800e-15 830-885
BL00420C 11.90 1.900e- 
13 355-366 BL00420C 
11.90 1.900e-12 808- 
819 BL00420C 11.90 
3.550e-12 248-259 
BL00420C 11.90 2.831e- 
11 141-152 BL00420C 
11.90 5.119e-11 1018-
1029 BL00420C 11.90 
7.955e-10 567-578 


853 


BL00420 


Speract receptor repeat 
proteins domain 
proteins . 


BL00420B 22.67 1.000e-
40 756-811 BL00420B
22.67 1.321e-38 966-
1021 BL00420B 22.67
8.457e-28 482-537
BL00420B 22.67 4.500e-
27 620-675 BL00420B 
22.67 9.625e-27 270- 
325 BL00420B 22.67 
4.205e-26 163-218 
BL00420B 22.67 5.731e- 
23 55-110 BL00420B
22.67 6.464e-20 377- 
432 BL00420B 22.67 
2.800e-15 863-918 
BL00420C 11.90 1.900e- 
13 355-366 BL00420C 
11.90 1.900e-12 841- 
852 BL00420C 11.90 
3.550e-12 248-259 
BL00420C 11.90 2.831e- 
11 141-152 BL00420C 
11.90 5.119e-11 1051-
1062 BL00420C 11.90 
7.955e-10 567-578 


857 


PR00388 


3',5'-CYCLIC NUCLEOTIDE
CLASS II 

PHOSPHODIESTERASE 
SIGNATURE 


PR00388A 10.45 2.778e-
09 64-83 


859 


BL00030 


Eukaryotic RNA-binding
region RNP-1 proteins. 


BL00030A 14.39 5.929e-
13 37-56 BL00030B
7.03 1.900e-11 167-177
BL00030A 14.39 2.000e- 
10 128-147 


861 


PR00988 


URIDINE KINASE SIGNATURE 


PR00988A 6.39 4.250e- 
17 23-41 PR00988C 
13.64 8.714e-16 107-
123 PR00988F 12.23 
7.828e-15 198-212 
PR00988E 8.27 9.769e- 
12 176-188 PR00988D 
5.95 8.250e-11 163-174
PR00988B 11.60 4.512e- 
10 60-72 


863 


BL00215 


Mitochondrial energy
transfer proteins. 


BL00215B 10.44 8.071e- 
12 41-54 


864 


PR00775 


90 KD HEAT SHOCK PROTEIN 
SIGNATURE 


PR00775E 8.06 1.000e-
24 198-221 PR00775B 
3.52 1.837e-23 107-130 
PR00775D 8.91 4.484e- 
17 171-189 PR00775A 
9.90 8.342e-17 86-107 
PR00775C 10.68 9.379e- 
17 153-171 PR00775G 
10.64 6.850e-15 267- 
286 PR00775F 12.76 
6.769e-14 249-267 


ODD 


DM01688


2 POLY-IG RECEPTOR. 


DM01688G 16.45 9.460e- 
09 89-121 


q cn 


PD01066


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 5.596e- 
29 14-53 


DtfQ 

OOg 


BL01287


RNA 3'-terminal
phosphate cyclase 
proteins . 


BL01287A 17.95 2.688e-
26 16-48 


869 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 6.464e- 
10 304-337 


o /Z 


BL00046


Histone H2A proteins. 


BL00046 12.95 1.000e-
40 30-85 


874 


BL00188


Biotin-requiring enzymes 
attachment site 
proteins . 


BL00188 30.29 9.036e- 
32 665-711 


876 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 7.686e-
09 298-315 


877 


PD02102 


SUBUNIT E V-ATPASE 
VACUOLAR ATP SYNTHASE 
HYDROL.


PD02102A 16.74 4.176e-
10 97-141 


879 


BL01189 


Ribosomal protein S12e 
proteins. 


BL01189A 14.27 1.000e-
40 35-71 BL01189B 
13.49 1.000e-40 71-125 


882 


BL00284 


Serpins proteins. 


BL00284C 28.56 6.400e-
25 62-104 BL00284B 
17.99 6.182e-12 35-56 


889 


BL00216 


Sugar transport 
proteins. 


BL00216B 27.64 4.375e- 
21 35-85 


896 


PR00391


PHOSPHATIDYLINOSITOL 
TRANSFER PROTEIN 
SIGNATURE 


PR00391E 12.50 7.785e- 
15 211-231 PR00391B 
8.39 1.000e-13 83-104 
PR00391D 12.21 9.326e- 
13 191-207 PR00391A 
7.83 5.390e-11 16-36


897 


PR00327 


ICE NUCLEATION PROTEIN
SIGNATURE


PR00327C 6.37 5.247e-
09 313-328


898 


BL00039 


DEAD-box subfamily ATP- 
dependent helicases
proteins . 


BL00039D 21.67 7.800e-
26 386-432 BL00039A 
18.44 6.674e-16 113- 
152 BL00039B 19.19 
1.947e-13 153-179 
BL00039C 15.63 9.460e- 
11 236-260 


901 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 8.200e-
16 254-267 PD00066 
13.92 8.200e-16 282- 
295 PD00066 13.92 
8.200e-16 310-323 
PD00066 13.92 8.200e- 
16 366-379 PD00066 
13.92 8.200e-16 394- 
407 PD00066 13.92 
8.200e-14 338-351 


902 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 9.321e- 
11 6-50 


903 


PR00806 


VINCULIN SIGNATURE 


PR00806B 4.28 9.160e- 
09 97-111 


904 


PR00381 


KINESIN LIGHT CHAIN 
SIGNATURE 


PR00381E 8.75 6.586e- 
25 335-356 PR00381B 
18.17 2.667e-24 204- 
224 PR00381A 9.55
2.800e-24 107-125
PR00381C 12.48 4.522e-
24 226-245 PR00381D
13.94 1.084e-22 291-
309 PR00381F 9.13
3.288e-22 370-392
PR00381F 9.13 7.181e-
13 286-308 PR00381E
8.75 4.066e-11 251-272
PR00381E 8.75 7.033e-
11 293-314 PR00381E
8.75 8.364e-10 377-398
PR00381D 13.94 5.230e-
09 333-351 PR00381C
12.48 7.120e-09 310-
329 


906 


PR00345 


STATHMIN FAMILY 
SIGNATURE 


PR00345C 4.54 8.557e- 
09 525-549 


907 


PR00345 


STATHMIN FAMILY 
SIGNATURE 


PR00345C 4.54 8.557e- 
09 513-537 


908 


BL00678


Trp-Asp (WD) repeat 
proteins proteins . 


BL00678 9.67 9.308e-11
144-155 


910 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 2.800e- 
30 48-87 


Q I 0 




BL01104


Ribosomal protein L13e
proteins. 


BL01104C 15.14 6.000e- 
09 364-392 


922 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 3.842e-09 
500-511 


923 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 2.500e- 
09 323-338 PR00320C 
13.01 5.500e-09 187- 
202 


924 


PD02181 


PROTOCHLOROPHYLLIDE 
REDUCTASE PHOTOSYNT.


PD02181D 12.85 8.609e- 
09 36-64 


926 


BL00019 


Actinin-type actin- 
binding domain proteins. 


BL00019C 14.** 7.453e- 
25 108-144 BL00019B 
13.34 6.510e-11 61-84
BL00019D 15.33 9.338e- 
11 205-235 BL00019A 
12.56 2.373e-10 34-45 


928 


BL00678 


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 9.308e-11
273-284 BL00678 9.67
1.600e-10 314-325
BL00678 9.67 7.600e-10
Jou-J/X DliUOb to 9.67 
8.579e-09 206-217 


929 


BL00518


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 1.857e- 
10 137-146 


930 


BL01085


Ribulose-phosphate 3-
epimerase family
proteins.


BL01085D 16.55 4.600e- 
24 134-165 BL01085B 
10.15 5.680e-22 30-52 
BL01085E 18.87 8.676e- 

21.81 2.038e-14 66-97 


931 


BL01085


Ribulose-phosphate 3-
epimerase family
proteins.


BL01085D 16.55 4.600e-
24 152-183 BL01085B 
1 fi 1 1| ^ c«n«_oo ?n_co 

BL01085E 18.87 8.676e- 
20 190-220 BL01085C 


933 


PD00301 


PROTEIN REPEAT MUSCLE 
CALCIUM-BI.


PD00301A 10.24 6.400e- 

r\Q 1 Cf\ 1*71 


936


PF00168 


C2 domain proteins. 


PF00168C 27.49 4.000e- 
12 336-362 


937 


BL00415 




BL00415N 4.29 9.519e-
10 5-49 


940 


PR00862 


PROLYL OLIGOPEPTIDASE
SERINE PROTEASE (S9A)
SIGNATURE


PR00862D 16.17 4.086e-
09 63-84 


945 


BL01230 


RNA methyltransferase
trmA family proteins.


BL01230B 11.62 2.373e- 
09 407-420 


948 


BL00479 


Phorbol esters /
diacylglycerol binding
domain proteins.


BL00479B 12.57 7.429e-
18 52-68 BL00479A
19.86 2.200e-13 26-49 


949 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins.


BL00678 9.67 1.474e-09
100-111 


954 


PD01311 


PROTEIN OXIDOREDUCTASE
NAD INTFRRPNTf PF 


PD01311A 30.23 5.909e- 
10 66-131


955 


PF00651 


BTB (also known as BR-
C/Ttk) domain proteins.


PF00651 15.00 3.250e-


956


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 3.250e- 
12 47-60 


957 


BL00379 


CDP-alcohol
phosphatidyltransferases
proteins.


BL00379 24.64 1.610e-
15 111-148


959 


BL01115 


GTP-binding nuclear
protein ran proteins.


BL01115A 10.22 1.884e- 

J. U Ji" /o 


960 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 3.438e- 
14 110-154 


962 


BL00061 


Short-chain
dehydrogenases/reductases
family proteins.


BL00061B 25.79 6.586e-
13 198-236


963 


PR00502 


MUTT DOMAIN SIGNATURE 


PR00502A 15.06 8.200e- 
11 210-225 


966


PR00308


TYPE I ANTIFREEZE
PROTEIN SIGNATURE 


PR00308A 5.90 7.035e- 
09 55-70 


967


DM01206 


CORONAVIRUS NUCLEOCAPSID
PROTEIN. 


DM01206B 10.69 1.286e-
12 104-124 DM01206B
10.69 5.299e-11 23-43
ui*i UJLZUbo 1U . by o . z / 46* 
10 73-93 DM01206B 
10.69 3.962e-09 108- 
128 DM01206B 10.69 
5.671e-09 38-58


969 


PF01008 


Initiation factor 2 
subunit . 


PF01008B 25.59 4.724e- 
31 417-460 PF01008C 
12.25 5.333e-18 506- 
526 PF01008A 20.14 
5.875e-15 369-390 


970 


BL01277 


Ribonuclease PH 
proteins . 


BL01277C 10.18 7.648e- 
10 112-143 BL01277A 
17.39 9.806e-10 40-78 


975 


BL01159 


WW/rsp5/WWP domain
proteins . 


BL01159 13.85 3.605e- 
12 130-145 BL01159 
13.85 4.122e-10 171- 
186 


977 


PF00791 


Domain present in ZO-1 
and Unc5-like netrin 
receptors . 


PF00791C 20.98 2.235e- 
09 55-94 


978 


BL01167 


Ribosomal protein L17 
proteins . 


BL01167B 20.66 8.258e- 
19 88-127 


979 


BL00478 


LIM domain proteins . 


BL00478B 14.79 9.357e- 
13 33-48 BL00478B 
14.79 7.250e-12 98-113 


980 


PR00312 


CALSEQUESTRIN SIGNATURE 


PR00312E 8.32 3.423e- 
36 169-199 PR00312I 
15.78 5.286e-35 332- 
361 PR00312F 15.06 
5.865e-35 199-229 
PR00312H 13.31 8.313e- 
35 263-291 PR00312J 
13.73 5.688e-34 363- 
392 PR00312D 9.43 
2.636e-33 128-158 
PR00312C 15.14 8.839e-
33 92-122 PR00312B
15.08 8.941e-33 62-92
PR00312G 11.11 6.657e- 
32 230-258 PR00312A 
11.70 6.914e-27 35-59 


981


PF00992 


Troponin . 


PF00992A 16.67 8.816e-
09 414-449 


982 


PR00299 


ALPHA CRYSTALLIN 
SIGNATURE 


PR00299F 13.20 2.367e- 
09 127-149 


983 


BL01150 


Respiratory-chain NADH
dehydrogenase 20 Kd 
subunit proteins. 


BL01150B 17.16 1.000e-
40 156-202 BL01150A
14.10 8.200e-39 100- 
138 


986 


BL00795 


Involucrin proteins . 


BL00795C 17.06 7.211e- 
14 4-49 BL00795C 
17.06 1.778e-11 1-46
BL00795C 17.06 3.407e- 
10 14-59 BL00795C 
17.06 7.802e-10 2-47 
BL00795C 17.06 8.640e- 
10 19-64 BL00795C 
17.06 7.400e-09 11-56 
BL00795C 17.06 7.800e- 
09 3-48 


987


BL00939


Ribosomal protein L1e
proteins . 


BL00939F 17.27 5.393e- 
09 810-840 


988


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 6.538e- 
11 525-541 


989 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 6.538e- 
11 497-513 


994 


BL00027 


'Homeobox' domain
proteins. 


BL00027 26.43 2.500e- 
25 146-189 


997 


BL01304 


ubiH/COQ6 monooxygenase
family proteins. 


BL01304A 8.05 3.893e- 
11 65-79 


998 


DM01767 


5 TRANSMITTER DOMAIN. 


DM01747B 10.07 7.868e- 
09 22-39 


1000 


PR00926 


MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE 


PR00926C 16.07 1.750e- 
24 73-94 PR00926D 
10.53 3.250e-23 126- 
145 PR00926F 17.75 
6.211e-23 217-240 
PR00926E 11.70 6.625e- 
20 174-193 PR00926B 
16.07 2.125e-18 24-39 
PR00926A 10.41 1.000e-
15 11-25 PR00926F 
17.75 5.565e-09 120- 
143 


1005 


BL00406 


Actins proteins. 


BL00406B 5.47 1.000e-
40 88-143 BL00406C 
6.75 1.000e-40 147-202 
BL00406D 12.58 3.700e- 
40 270-325 BL00406E
8.44 7.375e-38 327-377 
BL00406A 9.95 3.348e- 
29 11-46 


1006 


BL00406 


Actins proteins. 


BL00406B 5.47 1.000e-
40 88-143 BL00406C 
6.75 1.000e-40 147-202 
BL00406E 8.44 1.000e-
35 248-298 BL00406A 
9.95 3.348e-29 11-46 


1007 


PR00304 


TAILLESS COMPLEX
POLYPEPTIDE 1
(CHAPERONE) SIGNATURE


PR00304D 11.04 8.714e- 
22 384-407 PR00304C 
8.69 4.667e-20 98-118
PR00304B 11.60 7.577e- 
19 68-87 PR00304A 
9.20 3.382e-16 46-63 
PR00304E 7.79 6.870e- 
13 418-431 


1009 


PD01066


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 2.929e- 
32 9-48 


1011 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 2.929e- 
32 68-107 


1012 


BL00518


Zinc finger, C3HC4 type
(RING finger), proteins.


BL00518 12.23 6.143e- 
10 64-73 


1016 


PD01168 


SYNTHETASE LIGASE 
fKUl-LlIN AJjAJNIiJj. 


PD01168H 12.08 1.000e-
11 174-194 


1018 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930B 33.72 1.391e- 
32 261-302 PD00930A 
25.62 9.550e-22 157- 
183 


1022 


BL00175 


Phosphoglycerate mutase
family phosphohistidine
proteins . 


BL00175A 15.42 5.179e-
12 6-26 BL00175C
23.75 8.062e-10 79-111


1025 


PR00305 


14-3-3 PROTEIN ZETA
SIGNATURE


PR00305D 16.34 1.439e-
10 158-185 


102^ 


BL00353 


HMG1/2 proteins. 


BL00353B 11.47 2.436e- 
18 238-288 BL00353C 
14.83 8.844e-11 288-
335 


1028 


BL00183 


Ubiquitin-conjugating
enzymes proteins. 


BL00183 28.97 1.310e- 
33 43-91 


1033 


PF00580 


UvrD/REP helicase. 


PF00580A 13.37 4.720e- 
09 111-133 


1034 


PR00413 


HALOACID 

DEHALOGENASE/EPOXIDE
HYDROLASE FAMILY 
SIGNATURE 


PR00413E 15.78 3.429e-
09 154-171 


1037 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 9.657e- 
09 5-44 


1038 


PD01796 


PROTEIN TRANSMEMBRANE 
COBALT ZINC CADMIU. 


PD01796 15.01 4.259e- 
11 55-82 


1039 


BL00299 


Ubiquitin domain
proteins . 


BL00299 28.84 9.036e- 
09 17-69 


1040 


PR00970 


ARGININE ADP-
RIBOSYLTRANSFERASE
SIGNATURE


PR00970A 17.73 6.143e-
20 56-78 PR00970D
9.96 2.154e-18 154-171
PR00970F 12.30 1.000e-
16 224-241 PR00970G
9.97 9.229e-15 242-258
PR00970B 16.37 1.290e-
13 86-105 PR00970C
11.05 1.643e-11 115-
130 PR00970E 11.23
9.820e-11 202-218


1042 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 2.200e-10
243-254 


1043 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 6.786e-
13 114-128 PR00048A
10.52 1.000e-09 172-
186 


1045 


BL00615 


C-type lectin domain 
proteins . 


BL00615A 16.68 1.720e- 
11 218-236 BL00615B 
12.25 1.857e-10 317- 
331 


1046 


BL01092 


Adenylate cyclases 
class-I proteins. 


BL01092N 13.54 8.924e- 
10 3-40 


1047 


BL01216


ATP-citrate lyase / 
succinyl-CoA ligases 
family proteins. 


BL01216D 21.75 4.316e- 
28 314-344 BL01216A 
13.91 1.000e-10 97-112


1049 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031B 15.41 7.618e-
12 102-136 


1050 


BL01073 


Ribosomal protein L24e 
proteins . 


BL01073 24.30 1.000e-
40 12-62 


1054


BL00571 


Amidases proteins. 


BL00571 25.69 5.875e-
31 160-212 


1055 


BL00030 


Eukaryotic RNA-binding 
region RNP-1 proteins. 


BL00030A 14.39 5.235e- 
11 98-117 BL00030B 
7.03 4.316e-09 137-147 


1058 


BL00223 


Annexins repeat proteins 
domain proteins . 


BL00223C 24.79 8.754e- 
23 262-317 BL00223A 
15.59 9.478e-14 46-80 
BL00223A 15.59 5.557e- 
11 118-152 


1060 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 3.455e- 
35 158-201 


1064 


BL00455 


Putative AMP-binding 
domain proteins . 


BL00455 13.31 6.211e- 
13 280-296 


1065 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019A 11.19 2.000e- 
09 115-129 PR00019B 
11.36 3.880e-09 87-101 


1066 


PR00326 


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 4.600e- 
16 151-172 PR00326C 
9.79 1.290e-14 200-216 
PR00326B 16.74 8.548e- 
14 172-191 PR00326D 
19.09 1.257e-13 217- 
236 


1071 


PD02870


RECEPTOR INTERLEUKIN-1 
PRECURSOR. 


PD02870B 18.83 8.518e-
11 164-197 


1072 


PF00856 


SET domain proteins. 


PF00856A 26.14 5.976e-
09 350-387 


1075 


BL01009 


Extracellular proteins 
SCP/Tpx-1/Ag5/PR-1/Sc7
proteins . 


BL01009D 14.19 4.300e- 
20 127-148 BL01009A 
13.75 6.586e-13 57-75 
BL01009E 13.50 1.439e- 
11 159-175 


1077 


PR00724 


CARBOXYPEPTIDASE C 
SERINE PROTEASE (S10) 
FAMILY SIGNATURE 


PR00724A 10.91 1.000e-
08 366-379


1078


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 1.000e-
12 170-195 BL00215A 
15.82 7.523e-10 79-104 


1079 


BL00678 


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 4.316e-09
298-309


1081 


BL00326 


Tropomyosins proteins . 


BL00326A 14.01 7.398e- 
10 23-57 


1094 


BL00460 


Glutathione peroxidases 
selenocysteine proteins . 


BL00460A 28.67 3.204e-
18 57-92 BL00460B
9.73 6.400e-13 100-118
BL00460D 16.89 9.143e-
12 162-182 BL00460C 
14.35 5.500e-09 133- 
156 


1095 


PD02811 


PROTEIN PEPTIDE 
REDUCTASE MG448 PILB 
FIMBRIA TRAN. 


PD02811A 20.67 3.017e- 
22 67-105 PD02811B 
17.07 2.263e-21 118- 
151 PD02811C 13.25 
5.696e-13 154-167 


1096 


PD02811 


PROTEIN PEPTIDE 
REDUCTASE MG448 PILB 
FIMBRIA TRAN.


PD02811A 20.67 3.017e- 
22 60-98 PD02811B 
17.07 2.263e-21 111-
144 PD02811C 13.25 
5.696e-13 147-160 


1097 


BL00479 


Phorbol esters / 
diacylglycerol binding 
domain proteins . 


BL00479B 12.57 6.143e- 
09 200-216 


1105 


PF00881 


Nitroreductase family. 


PF00881A 27.15 9.229e- 
13 111-147 


1109 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 3.077e- 
10 15-37 PR00449E 
13.50 1.857e-09 185-
208 PR00449D 10.79 
8.364e-09 131-145 


1115 


PR00405 


HIV REV INTERACTING 
PROTEIN SIGNATURE 


PR00405B 11.83 5.737e- 
20 42-60 PR00405A 
17.71 2.703e-17 23-43 
PR00405C 19.41 6.902e- 
10 63-85 


1116 


BL00355 


HMG14 and HMG17 
proteins . 


BL00355 5.97 2.B28e-25 
20-51 


1117 


BL00355 


HMG14 and HMG17 
proteins . 


BL00355 5.97 2.528e-25 
20-51 


1120 


BL00107


Protein kinases ATP- 
binding region proteins. 


BL00107B 13.31 4.857e- 
10 290-306 


1123 


PR00412 


EPOXIDE HYDROLASE 
SIGNATURE 


PR00412F 18.76 9.526e- 
12 301-324 


1125 


PR00186 


HEMERYTHRIN SIGNATURE 


PR00186A 13.62 2.800e-
09 87-101 


1129 


BL00170 


Cyclophilin-type
peptidyl-prolyl cis-
trans isomerase
signatur.


BL00170C 18.49 3.077e-
33 84-129 BL00170B 
20.97 6.838e-25 37-77 
BL00170A 17.08 3.455e- 
15 10-37 


1131 


BL00636 


Nt-dnaJ domain proteins. 


BL00636A 8.07 5.304e- 
15 29-46 BL00636B 
15.11 1.360e-14 59-80 


1132 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 6.211e-09 
29-40 


1133 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 6.211e-09 
29-40 


1136 


BL00990 


Clathrin adaptor 
complexes medium chain 
proteins . 


BL00990C 18.78 4.176e-
38 235-269 BL00990A 
21.44 4.316e-36 94-132 
BL00990B 20.15 2.125e- 
27 157-187 BL00990D 
16.13 5.320e-18 403- 
422 


1137 


PR00314 


CLATHRIN COAT ASSEMBLY 
PROTEIN SIGNATURE 


PR00314B 15.68 5.000e-
34 100-128 PR00314D 
9.66 3.531e-33 233-261 
PR00314C 16.05 8.909e-
TO ICO loo nn n m 1 a y\ 

SZ Iby-Xoo PR00314A 
14.53 1.281e-22 13-34


1139 


BL01115 


GTP-binding nuclear 
protein ran proteins.


BL01115A 10.22 6.364e- 
13 13-57 


1141 


BL00107 


Protein kinases ATP- 
binding region proteins. 


BL00107A 18.39 4.000e-
19 451-482 BL00107B 
13.31 3.077e-12 519- 


1148 


PR00685 


TRANSCRIPTION INITIATION 
FACTOR IIB SIGNATURE


PR00685A 13.62 4.676e- 
09 21-42 


1155 


PD01652 


RECEPTOR CELL NK 

T VPnDDnTTT T NT T MMTTMAP T i^O 
wJu I LUrKU i Jc. XJN XMf'JUWUljJjVjB . 


PD01652B 8.50 9.39fee- ' 
10 522-574 PD01652B 
8.50 9.463e-10 740-792 


1157


PD02894


HYDROLASE N4- PRECURSOR
PROTEIN SIGNAL BE.


PD02894A 21.96 7.873e- 
28 81-127 PD02894B 
13.93 1.188e-27 178-
211 


1159 


BL00623 


GMC oxidoreductases 
proteins.


BL00623E 15.00 3.531e-
20 391-414 BL00623C 
10.86 4.240e-20 155- 
176 


1161 


PD01937 


DNA PROTEIN POLYMERASE
ENDONUCLEASE DNA-.


PD01937A 6.68 3.475e- 
09 330-341 


1162 


PD01937


DNA PROTEIN POLYMERASE
ENDONUCLEASE DNA-.


PD01937A 6.68 3.475e- 
09 221-232 


1163 


PR00624 


HISTONE H5 SIGNATURE


PR00624D 11.94 7.455e- 
10 214-239 PR00624D 
11.94 1.961e-09 312- 
337 


1167 


BL00226 


Intermediate filaments 
proteins . 


BL00226B 23.86 7.384e- 
09 302-350 


XX 1 l 


BL01032 


Protein phosphatase 2C 
proteins . 


BL01032G 8.33 1.422e- 
10 34-48 


1 1 TO 

XX fa 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320A 16.74 1.794e- 
10 205-220 PR00320C 
13.01 7.840e-10 205- 
220 PR00320B 12.19 
8.457e-10 35-50 
PR00320A 16.74 7.146e- 
09 35-50 PR00320B 
12.19 9.100e-09 79-94 


1180 


PR00454 


ETS DOMAIN SIGNATURE 


PR00454D 10.89 4.150e- 
19 765-784 


1 i on 


BL00291


Prion protein. 


BL00291A 4.49 8.962e- 
11 152-187 


1184 


BL00720


Guanine-nucleotide 
dissociation stimulators 
CDC25 family sign. 


BL00720B 16.57 4.103e- 
18 1089-1113 


1185 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 4.553e- 
13 204-229 BL00215A 
15.82 1.429e-12 11-36 
BL00215A 15.82 9.809e- 
11 104-129 


1187 


BL00983 


Ly-6 / u-PAR domain 
proteins . 


BL00983C 12.69 2.7£le- 
10 77-93 


1188 


BL00878 


Orn/DAP/Arg
decarboxylases family 2
pyridoxal-P attachment
si.


BL00878B 10.95 6.000e-
16 189-204 BL00878C 
17.74 8.435e-15 225- 
245 BL00878F 19.67 
3.625e-13 379-402
BL00878D 16.56 1.621e- 
09 270-289 


1191 


PD02939


PROTEIN GLUTATHIONE 
SYNTHETASE SY. 


PD02939B 10.10 2.723e- 
12 203-220 PD02939C 
20.01 1.000e-11 224-
252 


1193 


PR00345 


STATHMIN FAMILY 
SIGNATURE 


PR00345B 7.12 2.800e- 
28 72-101 PR00345E
8.54 7.652e-28 149-174 
PR00345C 4.54 9.100e- 
28 101-125 PR00345D 
10.97 1.964e-24 125- 
149 PR00345A 13.46 
5.645e-16 43-62 


1194 


PR00345


STATHMIN FAMILY
SIGNATURE


PR00345B 7.12 2.800e-
28 108-137 PR00345E 
8.54 7.652e-28 185-210 
PR00345C 4.54 9.100e- 
28 137-161 PR00345D 
10.97 1.964e-24 161- 
185 PR00345A 13.46
5.645e-16 79-98 


1195 


PF00995 


Sec1 family.


PF00995B 17.37 1.120e- 
13 224-264 


1196 


BL00982 


Bacterial-type phytoene
dehydrogenase proteins. 


BL00982A 18.41 6.738e- 
11 15-47 


1197 


BL01298 


Dihydrodipicolinate 
reductase proteins. 


BL01298A 13.90 5.959e-
09 51-73 


1203 


BL00061 


Short-chain
dehydrogenases/reductases
family proteins.


BL00061B 25.79 1.000e-
14 152-190 


1204 


PR00118 


BETA-LACTAMASE CLASS A
SIGNATURE 


PR00118F 16.42 9.386e- 
09 213-229 


1206 


BL01183 


ubiE/COQ5
methyltransferase family
proteins.


BL01183B 21.31 1.429e- 
37 184-229 BL01183D 
27.71 8.535e-27 264- 
307 BL01183A 13.25 
3.250e-23 51-73 
BL01183C 10.77 5.295e-
09 246-258 


1208 


BL00979 


G-protein coupled 
receptors family 3 
proteins . 


BL00979L 20.63 2.485e- 
09 105-146 


1209 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 4.857e- 
11 49-65 PF00023B 
14.20 1.818e-09 45-55


1212 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 7.750e- 
14 227-241 PR00048A 
10.52 4.316e-11 199-
213 


1213 


PR00450 


RECOVERIN FAMILY 
SIGNATURE 


PR00450C 12.22 1.720e- 
10 20-42 PR00450C 
12.22 3.506e-09 56-78
PR00450D 16.58 6.769e- 
09 44-64 


1216 


BL00412 


Neuromodulin (GAP-43) 
proteins . 


BL00412D 16.54 5.598e- 
10 179-230 


1219 


PR00456


RIBOSOMAL PROTEIN P2 
SIGNATURE 


PR00456E 3.0£ 5.348e-
11 249-264 


1222 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 7.231e-
15 295-308 PD00066
13.92 7.231e-15 406-
419 PD00066 13.92
2.286e-12 378-391
PD00066 13.92 7.857e-
12 434-447 PD00066
13.92 3.348e-11 350-
363


1223 


BL50058 


G-protein gamma subunit 
profile. 


BL50058 27.23 1.000e-
40 13-61 


1226 


BL00412 


Neuromodulin (GAP-43) 
proteins . 


BL00412D 16.54 8.439e- 
09 279-330 


1227 


BL00437 


Catalase proximal heme- 
ligand proteins. 


BL00437A 18.82 1.000e-
40 49-101 BL00437B 
16.28 1.000e-40 114- 
168 BL00437C 21.86
1.000e-40 190-239 
BL00437D 25.72 1.000e-
40 248-301 BL00437E 
23.95 1.000e-40 327- 
379 


1230 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 8.297e- 
10 6-60 


1231 


PR00735 


GLYCOSYL HYDROLASE 
FAMILY 8 SIGNATURE 


PR00735A 11.19 6.857e- 
09 391-405 


1232 


PR00497


NEUTROPHIL CYTOSOL
FACTOR P40 SIGNATURE


PR00497A 6.92 5.553e- 
10 158-176 


1233 


PR00497 


NEUTROPHIL CYTOSOL 
FACTOR P40 SIGNATURE


PR00497A 6.92 5.553e- 
10 158-176 


1235 


BL00866 


Carbamoyl-phosphate
synthase subdomain 
proteins . 


BL00866B 36.29 2.776e- 
09 75-121 


1237 


BL00027 


'Homeobox' domain
proteins.


BL00027 26.43 1.818e-
21 36-79 


1243 


PR00403 


WW DOMAIN SIGNATURE 


PR00403B 12.19 1.184e- 
11 10-25 


1246 


PD01168 


SYNTHETASE LIGASE 
PROTEIN ALANYL. 


PD01168L 9.47 2.837e- 
10 31-46 PD01168L 
9.47 4.490e-10 174-189 
PD01168L 9.47 7.612e- 
10 183-198 


1249 


BL00018 


EF-hand calcium-binding 
domain proteins. 


BL00018 7.41 2.800e-10 
183-196 


1254 


BL00183


Ubiquitin-conjugating
enzymes proteins. 


BL00183 28.97 2.440e- 
36 96-144 


1255 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 5.670e- 
11 8-52 


1256 


BL00373 


Phosphoribosylglycinamide
formyltransferase
proteins.


BL00373C 10.35 3.348e- 
12 143-156 


1258 


PR00011 


TYPE III EGF-LIKE 
SIGNATURE 


PR00011B 13.08 3.217e- 
10 174-193 


1259 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 8.286e- 
10 31-40 


1261 


PR00070 


DIHYDROFOLATE REDUCTASE 
SIGNATURE 


PR00070D 11.63 1.000e-
15 112-127 PR00070C 
13.09 9.500e-15 51-63 
PR00070A 12.92 5.500e- 
12 16-27 


1262 


BL00462 


Gamma-
glutamyltranspeptidase
proteins.


BL00462A 20.89 6.438e- 
24 140-183 BL00462B 
17.88 5.500e-20 230- 
267 BL00462C 27.41 
2.023e-11 292-347


1263 


BL00038 


Myc-type, 'helix-loop-
helix' dimerization 
domain proteins. 


BL00038B 16.97 9.455e- 
11 62-83 


1264 


BL01115 


GTP-binding nuclear 
protein ran proteins. 


BL01115A 10.22 5.670e- 
11 17-61 


1266 


PR00837 


ALLERGEN V5/TPX-1 FAMILY 
SIGNATURE 


PR00837C 17.21 2.714e-
18 165-182 PR00837A 
14.77 4.512e-12 86-105 
PR00837D 11.12 7.577e- 
12 201-215 


1269 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449C 17.27 9.308e- 
22 40-63 PR00449E 
13.50 1.000e-16 137- 
160 PR00449D 10.79 
3.520e-11 102-116


1270 


BL00276 


Channel forming colicins 
proteins. 


BL00276A 8.87 1.500e-
09 17-29 


1275 


PD02327 


GLYCOPROTEIN ANTIGEN 
PRECURSOR IMMUNOGLO.


PD02327C 15.47 9.769e- 
09 228-243 


1276 


PR00412 


EPOXIDE HYDROLASE
SIGNATURE


PR00412B 12.59 7.894e-
12 119-135 PR00412C
11.30 1.857e-11 165-
1 /y PR00412A 13.23
3.400e-11 100-119


1277


PF00756


Putative esterase.


FrUU/DoL 14 . XZ y.bJoe- 
10 127-157 


1279 


BL00134 


Serine proteases,
trypsin family,
histidine proteins.


BL00134A 11.96 9.325e- 
1J l*io-14b 


1280 


BL01220


Phosphatidylethanolamine
-binding protein family


BL01220C 14.75 9.348e-
15 248-276 


1285


BL00518


Zinc finger, C3HC4 type
(RING finger), proteins.


BL00518 12.23 2.286e- 
10


1287 


PF00791 


Domain present in ZO-1
and Unc5-like netrin 


PF00791B 28.49 7.182e- 
11 288-343


1292 


PR00802 


SERUM ALBUMIN FAMILY 


PR00802B 16.51 1.610e-
10 81-105 


1297 


PR00716 


M-PHASE INDUCER
PHOSPHATASE SIGNATURE 


PR00716C 17.65 5.696e- 
09 23-44 


1298 


BL00478 


LIM domain proteins.


BL00478B 14.79 6.478e- 
14 268-283 




BL00127


Pancreatic ribonuclease 
family proteins. 


BL00127C 31.49 3.571e- 
28 82-126 BL00127B 
26.57 8.800e-28 23-68 


1302 


PR00637 


TYPE 3 BOMBESIN RECEPTOR 
SIGNATURE 


PR00637E 11.27 4.250e-
09 290-306 


1307 


BL00215 


Mitochondrial energy 
transfer proteins. 


BL00215A 15.82 5.500e-
17 13-38 BL00215A 
15.82 1.000e-16 226- 
251 BL00215A 15.82 
2.658e-13 107-132 


1308 


PR00898 


VASOPRESSIN V2 RECEPTOR
SIGNATURE


PR00898H 11.34 4.682e- 
09 552-572 


1309 


PD00301 


PROTEIN REPEAT MUSCLE 
CALCIUM-BI.


PD00301B 5.49 2.731e-
09 390-401 






Ly-6 / u-PAR domain 
proteins. 


BL00983C 12.69 9.654e- 
13 73-89 BL00983B 
8.19 3.132e-09 12-22 


1313 


BL00194 


Thioredoxin family 
proteins. 


BL00194 12.15 l.dOOe- 
11 15-28 


1314


BL00594


Aromatic amino acids
permeases proteins.


BL00594A 16.75 8.969e- 
10 53-97 


1315


BL00134


Serine proteases, 
trypsin family, 
histidine proteins. 


BL00134A 11.96 9.325e- 
13 128-145 


1320 


BL00783


Ribosomal protein L13
proteins.


BL00783C 22.43 6.559e-
24 87-117 BL00783A 
14.55 1.600e-19 8-33 
BL00783B 12.76 3.500e- 
12 74-86 


1327


PF00514


Armadillo/beta-catenin-
like repeat proteins.


PF00514A 31.30 7.268e- 
11 82-120 


1329 


BL00030 


Eukaryotic RNA-binding
region RNP-1 proteins.


BL00030A 14.39 6.294e-
11 129-148 BL00030B 
7.03 4.789e-09 168-178 




PR00497


NEUTROPHIL CYTOSOL
FACTOR P40 SIGNATURE


PR00497A 6.92 7.239e-
09 25-43


1332 


PR00161 


NICKEL-DEPENDENT
HYDROGENASE/B-TYPE 
CYTOCHROME SIGNATURE 


PR00161C 9.51 4.930e- 
09 317-337 


1333 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 6.769e- 
33 10-49 


1336 


PR00700 


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700D 12.47 2.200e- 
09 262-281


1337 


PR00700 


PROTEIN TYROSINE
PHOSPHATASE SIGNATURE


PR00700D 12.47 2.200e-
09 211-230


1340 


PR00860 


VERTEBRATE
METALLOTHIONEIN
SIGNATURE


PR00860A 5 afi 5 n."*A#*_ 
13 5-18 


1341 


BL00893 


mutT domain proteins. 


BL00893 18.99 6.750e- 
16 46-71 


1343 


BL01282 


BIR repeat proteins. 


BL01282B 30.49 5.974e-
21 383-422 


1344 


DM00099 


4 kw A55R REDUCTASE 
TERMINAL 

DIHYDROPTERIDINE. 


DM00099B 14.73 8.313e-
09 417-427 


1345 


BL00923 


Aspartate and glutamate 
racemases proteins . 


BL00923B 11.41 5.935e- 
10 135-146 


1348 


PF00651 


BTB (also known as BR-
C/Ttk) domain proteins.


PF00651 15.00 7.231e-
13 44-57 


1350 


PR00193 


MYOSIN HEAVY CHAIN 
SIGNATURE 


PR00193D 14.36 3.571e-
32 416-445 PR00193C
12.60 6.318e-31 179-
207 PR00193B 11.69 
3.571e-24 133-159 
PR00193E 19.47 9.069e- 
22 470-499 PR00193A 
15.41 1.783e-20 77-97 


1352 


PR00447 


NATURAL RESISTANCE-
ASSOCIATED MACROPHAGE
PROTEIN SIGNATURE


PR00447E 9.73 1.554e-
15 299-319 PR00447D
13.54 3.408e-15 200-
224 PR00447A 12.73
6.357e-11 97-124
PR00447G 6.69 9.877e- 
10 353-373 


1353 


BL00303 


S-100/ICaBP type calcium
binding protein.


BL00303A 21.77 6.667e-
26 45-82 BL00303B
26.15 1.000e-24 93-130


1355 


BL00039 


DEAD-box subfamily ATP-
dependent helicases 
proteins . 


BL00039D 21.67 5.950e- 
29 375-421 BL00039A 
18.44 7.136e-29 99-138 
BL00039C 15.63 4.000e- 
18 225-249 BL00039B 
19.19 3.182e-14 141- 
167 


1357 


PF00615 


Regulator of G protein 
signalling domain 
proteins. 


PF00615B 16.25 2.216e- 
12 84-101 PF00615C 
10.06 8.412e-12 162- 
176 


1360 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 9.234e-
29 10-49 


1361 


PR00925 


NONHISTONE CHROMOSOMAL 
PROTEIN HMG17 FAMILY
SIGNATURE 


PR00925A 5.47 5.091e- 
lo 14-29 PR00925B 
3.73 6.143e-14 29-42 
PR00925C 5.57 4.789e- 
12 53-64 PR00925D 
6.56 1.857e-10 76-87 


1362


BL01272


Glucokinase regulatory 
protein family proteins. 


BL01272B 19.61 6.870e- 
30 136-171 BL01272C 
11.68 3.314e-25 249- 
274 BL01272A 6.49 
1.231e-18 99-117 


1363 


BL01272 


Glucokinase regulatory 
protein family proteins. 


BL01272B 19.61 6.870e- 
30 113-148 BL01272C 
11.68 3.314e-25 226- 
251 BL01272A 6.49 
1.231e-18 76-94 


1364 


DM00179


kw KINASE ALPHA ADHESION
T-CELL.


DM00179 13.97 5.304e-
09 167-177 


1368


PR00169


POTASSIUM CHANNEL 
SIGNATURE 


PR00169A 16.77 1.592e- 
09 76-96 


1370 


PR00988 


URIDINE KINASE SIGNATURE 


PR00988A 6.39 1.794e-
10 1-19 


1371 


BL00242


Integrins alpha chain
proteins.


BL00242B 8.13 8.615e-
09 469-479 


1372 


PR00625 


DNAJ PROTEIN FAMILY 
SIGNATURE 


PR00625B 13.48 7.353e-
19 46-67 PR00625A 
12.84 1.391e-16 14-34 


1373 


BL00434 


HSF-type DNA-binding
domain proteins. 


BL00434C 23.85 3.770e- 
09 90-130 


1374 


PR00962 


LETHAL(2) GIANT LARVAE
PROTEIN SIGNATURE 


PR00962C 8.00 6.337e- 
09 505-526 


1375


PD02475


MUCIN EPITHELIAL TUMOR-
ASSOCIATE.


PD02475A 23.18 8.552e- 
10 1111-1150 


1376 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 9.571e-
32 24-63 


1380 


BL00194 


Thioredoxin family 
proteins . 


BL00194 12.16 6.333e- 
12 48-61 


1381 


DM01970 


0 kw ZK632.12 YDR313C 
ENDOSOMAL III. 


DM01970B 8.60 1.458e- 
15 1123-1136 


1383


BL00678


Trp-Asp (WD) repeat
proteins proteins.


BL00678 9.67 7.600e-10
243-254 


1384 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 7.600e-10 
271-282 


1385 


BL00303


S-100/ICaBP type calcium 
binding protein. 


BL00303B 26.15 6.203e- 
10 95-132 


1386 


BL01160 


Kinesin light chain 
repeat proteins . 


DJV1J.OUO 3 . UfJze- 

09 1574-1628 


1387 


BL00518 


Zinc finger, C3HC4 type 
(RING finger), proteins. 


fluuuDio L£.£$ x . t/uue— 
11 52-61 


1389 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 3.600e- 
30 10-49 


1390 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL- 
BINDING NU. 


PD01066 19.43 3.512e- 
31 32-71 


1392 


PR00308 


TYPE I ANTIFREEZE 
PROTEIN SIGNATURE 


PR00308C 3.83 9.723e- 
10 127-137 


1393 


PR00380


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 9.625e- 
25 88-110 PR00380D 
9.93 2.406e-20 304-326 
PR00380B 12.64 4.414e- 
16 208-226 PR00380C 
13.18 6.538e-16 243-
*3 a "> 


1394 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 3.400e- 
14 462-475 PD00066 
J. J . y z o.oUUe-Kl 348- 
361 PD00066 13.92 

7.3 (iC"i,6 ^Uj-^IH * 

PD00066 13.92 6.087e- 
±± *±^u — juo rUUUUbb 
13.92 8.043e-11 320-
333 


1398 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 6.786e- 
32 10-49 


1400 


DM01206 


CORONAVIRUS NUCLEOCAPSID 
PROTEIN. 


DM01206B 10.69 7.038e- 
09 270-290 


1406 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930A 25.62 7.324e- 
15 363-389 


1407 


BL00030 


Eukaryotic RNA-binding
region RNP-1 proteins. 


BL00030A 14.39 7.500e- 
10 457-476 


1408 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019A 11.19 9.550e- 
11 179-193 PR00019A 
11.19 8.826e-10 228- 
242 PR00019B 11.36 
1.360e-09 199-213 
PR00019B 11.36 4.960e-
09 176-190 


1409 


PR00510


NEBULIN SIGNATURE 


PR00510A 9.09 4.150e- 
12 182-202 PR00510B 
12.96 8.767e-12 210- 
230 PR00510F 9.88 
8.172e-10 58-75 
PR00510D 9.21 2.367e- 
09 251-267 


1410 


PD00078


REPEAT PROTEIN ANK 
NUCLEAR ANKYR. 


PD00078B 13.14 5.696e- 
09 31-44 


1 A 1 O 


BL00358 


Ribosomal protein L5 
proteins . 


BL00358B 22.7^ l.OOOe- 
40 57-103 BL00358C 
13.75 6.087e-14 122- 
136 BL00358D 14.26 
5.500e-13 143-158 
BL00358A 13.06 1.931e- 
11 33-44 


1414


BL00282


Kazal serine protease
inhibitors family 
proteins . 


BL00282 16.88 7.338e- 
10 511-534 


1415 


BL00023 


Type II fibronectin 
collagen-binding domain 
proteins . 


BL00023 24.31 4.300e- 
29 40-77 


1417 


PR00681


RIBOSOMAL PROTEIN S1
SIGNATURE


PR00681G 12.54 2.149e-
09 38-60 


1418 


DM00973 


3 kw RESISTANCE BENOMYL
YLL028W CYCLOHEXIMIDE.


DM00973A 21.17 1.462e-
09 171-208 


1419 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319B 11.47 1.571e- 
09 428-443 


1420 


PD01941 


TRANSMEMBRANE 
COTRANSPORTER SYMP.


PD01941A 14.81 1.000e-
40 142-196 PD01941B
15.02 7.049e-30 400-
447 PD01941E 15.92 
2.475e-20 817-864 
PD01941C 19.96 3.118e- 
19 488-543 PD01941D 
27.18 9.614e-18 641- 
690 PD01941F 28.52 
5.382e-15 1038-1093 


1422 


PR00205 


CADHERIN SIGNATURE 


PR00205B 11.39 8.043e- 
12 199-217 


1423 


PR00209 


ALPHA/BETA GLIADIN
FAMILY SIGNATURE 


PR00209B 4.88 £.318e- 
11 1009-1028 


1424 


BL50002 


Src homology 3 (SH3) 
domain proteins profile. 


BL50002A 14.19 8.200e- 
14 367-386 BL50002A 
14.19 9.250e-12 298-
317 BL50002A 14.19
4.462e-11 208-227
BL50002B 15.18 1.000e-
09 244-258 


1425 


PF00628 


PHD-finger.


PF00628 15.84 3.045e- 
12 330-345 


1426 


PF00628 


PHD-finger.


PF00628 15.84 3.045e- 
12 377-392 


1427 


PR00405 


HIV REV INTERACTING 
PROTEIN SIGNATURE 


PR00405B 11.83 5.114e- 
16 281-299 PR00405A 
17.71 4.306e-14 262- 
282 


1428 


BL00039 


DEAD-box subfamily ATP-
dependent helicases
proteins.


BL00039D 21.67 5.219e-
34 147-193 


1429 


PR00320


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320C 13.01 8.920e- 
10 577-592 


1430 


PR00378


INOSITOL PHOSPHATASE 
SIGNATURE 


PR00378D 16.86 7.563e- 
12 295-314 PR00378B 
13.80 8.650e-10 166- 
186 


1431 


PR00928 


GRAVES DISEASE CARRIER
PROTEIN SIGNATURE


PR00928B 13.53 3.769e-
10 103-124


1433 


BL01113 


C1q domain proteins.


BL01113B 18.26 7.049e- 
15 14-50 BL01113C 
13.18 7.000e-12 82-102 


1434 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE


PR00319B 11.47 7.983e- 
10 135-150 


1436 


BL00030 


Eukaryotic RNA-binding 
region RNP-1 proteins. 


BL00030A 14.39 1.000e-
12 84-103 


1438 


BL00290 


Immunoglobulins and 
major histocompatibility 
complex proteins . 


BL00290B 13.17 2.500e-
09 250-268 BL00290A 
20.89 4.000e-09 188- 
211 


1440 


PR00806 


VINCULIN SIGNATURE 


PR00806B 4.28 4.960e- 
09 38-52 


1441 


PR00806 


VINCULIN SIGNATURE 


PR00806B 4.28 4.960e- 
09 88-102 


1444 


BL00422 


Granins proteins . 


BL00422D 19.48 1.000e-
08 114-138 


1445 


PD01841 


PHOSPHORYLASE KINASE 
ALPHA MUSCL. 


PD01841A 21.71 1.000e-
40 73-123 PD01841B
14.35 1.000e-40 144-
185 PD01841D 17.87
1.000e-40 206-258
PD01841F 13.36 1.000e-
40 296-345 PD01841G
24.26 1.000e-40 349-
403 PD01841I 23.00
1.000e-40 494-536
PD01841J 14.94 1.000e-
40 895-932 PD01841L
18.42 1.000e-40 1083-
1125 PD01841E 18.60
9.719e-38 258-296
PD01841K 14.81 1.000e-
35 1041-1071 PD01841H
21.30 3.189e-31 435-
472 PD01841C 13.78
1.000e-25 185-206
PD01841M 10.82 1.250e-
20 1175-1194


1446 


PF00816 


H-NS histone family.


PF00816B 13.84 8.875e-
09 190-220 


1447 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 2.080e- 
09 402-416 


1448


DM00315 


072 RIBONUCLEASE 
INHIBITOR . 


DM00315D 18.40 7.393e- 
09 23-67 


1451 


BL00030 


Eukaryotic RNA-binding 
region RNP-1 proteins.


BL00030B 7.03 2.800e- 
10 94-104 


1454 


DM01688 


2 POLY-IG RECEPTOR. 


DM01688D 13.44 7.14$e- 
09 382-405 


1455 


PF00777


Sialyltransferase
family. 


PF00777C 18.60 2.929e- 
22 4-59 


1457 


BL00927 


Trehalase proteins. 


BL00927C 10.83 8.085e- 
09 42-53 


1460 


BL00545 


Aldose 1-epimerase 
proteins . 


BL00545C 11.28 7.353e-
17 169-182 BL00545A 
10.20 2.071e-15 73-89 
BL00545B 13.10 3.942e- 
09 140-153 


1466 


PR00097


ANTHRANILATE SYNTHASE 
COMPONENT II SIGNATURE 


PR00097C 9.42 9.069e- 
09 233-245 


1472 


BL01129 


Hypothetical 
yabO/yceC/sfhB family 
proteins. 


BL01129E 13.25 5.250e- 
22 170-195 BL01129C 
25.56 9.526e-18 63-106 


1473 


BL00790 


Receptor tyrosine kinase 
class V proteins. 


BL00790I 20.01 2.821e-
09 2114-2145


1475 


PF00686


Starch binding domain 
proteins . 


PF00686A 13.45 9.100e- 
09 267-277


1477 


PF00566 


Probable rabGAP domain 
proteins . 


PF00566A 12.64 7.333e- 
10 466-476 


1478 


BL00030 


Eukaryotic RNA-binding 
region RNP-1 proteins. 


BL00030B 7.03 9.400e- 
10 43-53 


1479 


DM00406 


«T Tl\r\TM 

GLIADIN . 


DM00406 7.73 8.541e-10 
292-305 


1480


BL00290


Immunoglobulins and 
major histocompatibility 
complex proteins . 


BL00290B 13.17 2.385e- 
15 69-87 BL00290A 
20.89 5.091e-11 12-35


1481 


PR00150 


PHOSPHOENOLPYRUVATE
CARBOXYLASE SIGNATURE 


PR00150F 10.45 9.039e- 
09 21-51 


1482


PF00780 


Domain found in NIK1- 
like kinases, mouse 
citron and yeast ROM. 


PF00780I 14.69 4.825e-
09 107-137 


1483 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 1.153e-
09 108-162 


1485 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 5.909e-
25 17-56 


1486 


BL00107 


Protein kinases ATP- 
binding region proteins . 


BL00107B 13.31 1.529e- 
09 34-50 


1488 


BL00039 


DEAD- box subfamily ATP- 
dependent helicases 
proteins . 


BL00039D 21.67 9.586e- 
10 116-162


1490 


BL00166 


Enoyl-CoA
hydratase/isomerase
proteins.


BL00166D 22.87 2.607e- 
24 190-226 BL00166C 
18.93 5.500e-14 140- 
167 BL00166B 16.92 
9.357e-11 93-115


1491 


BL00452 


Guanylate cyclases 
proteins . 


BL00452D 28.59 3.700e- 
31 63-106 BL00452E 
11.92 3.045e-13 115- 
131 


1492 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE


PR00019A 11.19 3.667e- 
09 532-546 


1497 


BL00107 


Protein kinases ATP- 
binding region proteins . 


BL00107B 13.31 1.000e-
11 384-400 BL00107A
18.39 5.345e-11 322-
353 


1500 


PF00876 


Ogre family. 


PF00876E 7.99 1.947e- 
10 107-117 


1502 


BL00027 


'Homeobox' domain
proteins . 


BL00027 26.43 4.789e- 
24 112-155 


1503 


BL00027 


'Homeobox' domain
proteins . 


BL00027 26.43 4.789e- 
24 112-155 


1505 


BL01177 


Anaphylatoxin domain 
proteins . 


BL01177E 20.64 5.800e- 
24 448-475 BL01177C 
17.39 5.333e-19 402- 
421 BL01177B 13.61 
7.840e-16 155-171 
BL01177D 17.50 1.900e- 
15 427-445 


1506 


BL00972 


Ubiquitin carboxyl- 
terminal hydrolases 
family 2 proteins . 


BL00972D 22.55 5.500e- 
14 311-336 BL00972A 
11.93 7.429e-14 48-66 
BL00972E 20.72 8.759e- 
10 341-363 


1512 


BL00523 


Sulfatases proteins. 


BL00523E 19.27 4.536e- 
22 76-106 BL00523D 
9.89 1.563e-11 40-52
BL00523F 10.85 4.162e- 
09 159-170 BL00523G 
9.46 5.333e-09 256-266 


1516 


BL00914 


Syntaxin / epimorphin 
family proteins. 


BL00914 24.91 7.045e- 
14 168-218


1518 


BL00600 


Aminotransferases class-
III pyridoxal-phosphate
attachment si. 


BL00600A 17.98 6.143e- 
19 98-122 BL00600E 
16.43 1.771e-17 302-
331 BL00600G 12.43 
y.625e-17 377-396 
BL00600B 19.60 5.091e- 
15 160-186 BL00600C 
16.18 6.040e-12 190- 
206 BL00600F 8.77 
1.000e-11 343-356
BL00600D 8.71 1.000e-
10 281-295 


1523 


PD00930 


PROTEIN GTPASE DOMAIN 
ACTIVATION. 


PD00930B 33.72 9.600e- 
18 41-82 


1528 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320B 12.19 4.774e- 
11 192-207 PR00320B 
12.19 8.839e-11 272-
287 PR00320B 12.19 
3. /4je-lU lUo-121 
PR00320A 16.74 1.878e-
09 192-207 PR00320A 

Xb . fH *.Jl/e-U3 106- 

121 PR00320A 16.74 
8.683e-09 272-287 
PR00320C 13.01 8.800e- 
09 106-121 


1538 


DM01970 


0 kw ZK632.12 YDR313C 
ENDOSOMAL III. 


DM01970B 8.60 4.508e- 
15 171-184 


1539 


PF00781 


Diacylglycerol kinase 
catalytic domain 
proteins (presumed) . 


PF00781D 11.11 7.593e- 
10 103-127 


1540 


PR00965 


OCULAR ALBINISM TYPE 1 
PROTEIN SIGNATURE 


PR00965H 10.73 1.231e- 
29 312-334 PR00965E 
12.93 5.846e-29 172- 
195 PR00965F 5.98 
1.123e-28 209-231 
PR00965C 15.04 1.000e-
27 131-151 PR00965D 
5.84 1.000e-27 150-170 
PR00965G 8.52 2.440e- 
27 258-279 PR00965B 
4.80 8.650e-26 88-109
PR00965A 12.52 1.000e-
25 35-55 PR00965I 
3.91 6.442e-25 385-406 


1541 


BL01013 


Oxysterol-binding
protein family proteins.


BL01013D 26.81 9.719e-
17 163-207 


1543 


PD02699 


PROTEIN DNA-BINDING
BINDING DNA.


PD02699C 24.84 1.000e-
40 599-646 PD02699A 
8.91 2.286e-34 219-248 
PD02699B 18.28 6.143e- 
21 485-509 


1544


PR00049


WILM'S TUMOUR PROTEIN
SIGNATURE


PR00049D 0.00 7.857e- 
10 102-197 PR00049D 
0.00 7.102e-09 67-82 


1547 


BL00951 


ER lumen protein 
retaining receptor 
proteins . 


BL00951C 19.35 1.000e-
40 93-142 BL00951D 
13.94 8.714e-40 142- 
177 BL00951A 15.10 
1.000e-38 2-38
BL00951B 14.23 6.250e- 
33 38-69 


1548 


BL00536 


Ubiquitin-activating
enzyme proteins. 


BL00536F 13.65 8.920e- 
30 279-318 BL00536D 
22.91 5.737e-24 21-65
BL00536E 16.94 4.696e- 
18 248-279 


1549 


PR00139 


ASPARAGINASE/GLUTAMINASE
FAMILY SIGNATURE 


PR00139C 11.72 9.679e- 
09 550-569 


1553 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 5.119e- 
09 58-73


1556 


BL00061 


Short-chain
dehydrogenases/reductases
family proteins.


BL00061B 25.79 6.276e- 
13 67-105 


1557 


BL01228 


Hypothetical cof family 
proteins . 


BL01228D 17.44 8.105e- 
12 107-132 


1558 


BL01228 


Hypothetical cof family 
proteins . 


BL01228D 17.44 8.105e- 
12 107-132 


1559 


BL01228 


Hypothetical cof family 
proteins . 


BL01228D 17.44 8.105e- 
12 107-132 


1562 


BL00522 


DNA polymerase family X 
proteins . 


BL00522C 11.90 6.600e- 
18 412-436 BL00522B 
27.30 1.738e-16 364- 
410 BL00522A 25.52 
6.000e-16 279-326 
BL00522E 19.63 6.123e- 
14 502-532 BL00522F 
14.90 2.385e-13 551- 
575 


1563 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 1.947e- 
11 46-59 


1564 


BL00299 


Ubiquitin domain 
proteins . 


BL00299 28.84 2.823e-
10 324-376 


1566 


BL01013 


Oxysterol-binding
protein family proteins.


BL01013D 26.81 8.594e-
17 184-228 BL01013C
9.97 4.906e-12 14-24


1567 


BL00678 


Trp-Asp (WD) repeat
proteins proteins . 


BL00678 9.67 3.400e-10 
378-389 BL00678 9.67 
5.800e-10 418-429 
BL00678 9.67 8.800e-10 
295-306 


1570 


BL00479 


Phorbol esters / 
diacylglycerol binding 
domain proteins . 


BL00479B 12.57 5.235e- 
17 297-313 BL00479A 
19.86 6.625e-15 271-
294 BL00479A 19.86 
2.667e-14 147-170 
BL00479B 12.57 6.294e- 
12 173-189 


1576 


PR00665


OXYTOCIN RECEPTOR 
SIGNATURE 


PR00665G 12.36 4.673e- 
24 364-384 PR00665D 
9.93 1.200e-22 138-155 
PR00665F 11.73 4.000e- 
22 337-354 PR00665C
5.89 1.000e-20 65-80 
PR00665B 5.29 4.337e- 
19 24-39 PR00665E 
5.60 2.929e-15 246-260 
PR00665A 5.99 5.622e- 
15 11-25 


1577 


DM00099 


4 kw A55R REDUCTASE 
TERMINAL 

DIHYDROPTERIDINE . 


DM00099B 14.73 9.308e- 
10 127-137 


1579 


BL00524


Somatomedin B domain 
proteins . 


BL00524A 9.65 6.776e- 
14 52-73 


1580 


PD02894


HYDROLASE N4- PRECURSOR 
PROTEIN SIGNAL BE. 


PD02894B 13.93 6.959e- 
16 182-215 PD02894A 
21.96 2.125e-10 57-103 


1581 


BL00411


Kinesin motor domain 
proteins. 


BL00411C 15.04 5.292e- 
12 32-54 BL00411H 
15.66 4.441e-11 245-
276 


1582 


PR00604 


CLASS IA AND IB 
CYTOCHROME C SIGNATURE 


PR00604A 11.13 2.440e- 
09 79-87 


1584 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 1.000e-
10 225-238 


1585 


DM01551


kw OSTEOINDUCTIVE YOPM 
MEMBRANE OUTER . 


DM01551C 14.62 9.455e- 
11 125-145 


1586 


DM01354 


kw TRANSCRIPTASE REVERSE 
II ORF2. 


DM01354S 11.61 7.750e- 
09 474-495


1587 


PR00072


MALIC ENZYME SIGNATURE 


PR00072B 13.77 7.955e-
33 180-210 PR00072A 
12.75 6.040e-25 120- 
145 PR00072C 11.42 
2.286e-24 216-239 
PR00072D 10.77 3.400e- 
22 276-295 PR00072E 
10.54 1.360e-19 301- 
318 PR00072G 10.45 
5.304e-19 433-450 
PR00072F 8.87 5.935e- 
15 332-349 


1589 


BL00191 


Cytochrome b5 family, 
heme-binding domain
proteins.


BL00191H 15.64 1.537e- 
22 61-113 BL00191K 
17.38 9.027e-12 398- 
442 


1590 


DM01970 


0 kw ZK632.12 YDR313C 
ENDOSOMAL III. 


DM01970B 8.60 7.716e- 
13 211-224 DM01970B
8.60 2.157e-12 94-107 


1591 


DM00517 


5 kw NUCLEAR 6 0.7 NUP1 
CHROMOSOME . 


DM00517B 10.96 6.625e- 
16 1175-1193 DM00517A 
8.21 1.000e-11 1015-
1026 


1592 


BL00037 


Myb DNA-binding domain 
proteins repeat proteins 
proteins . 


BL00037B 15.92 3.250e- 
27 116-142 BL00037A 
16.68 2.500e-24 83-107 
BL00037A 16.68 3.250e- 
12 31-55 BL00037B 
15.92 3.526e-11 64-90
BL00037C 16.86 9.654e- 
10 146-164 


1595 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 1.514e- 
09 110-127 


1598 


PF00628 


PHD-finger.


PF00628 15.84 3.250e- 
11 1667-1682 


1599 


PR00014 


FIBRONECTIN TYPE III 
REPEAT SIGNATURE 


PR00014D 12.04 5.500e- 
09 980-995 


1600


BL00518


Zinc finger, C3HC4 type 
(RING finger), proteins. 


BL00518 12.23 £.571e- 
10 30-39 


1602 


BL00412 


Neuromodulin (GAP-43)
proteins . 


BL00412D 16.54 5.402e- 
10 136-187 


1605 


PF00651 


BTB (also known as BR- 
C/Ttk) domain proteins. 


PF00651 15.00 3.571e- 
10 44-57


1607


BL00252 


Interferon alpha, beta 
and delta family 
proteins. 


BL00252A 18.49 6.657e- 
23 20-57 BL00252B 
19.78 9.125e-16 58-109


1610 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 1.000e-
08 61-94 


1611 


BL00904 


Protein prenyltransferases alpha
subunit repeat proteins
proteins.


BL00904C 8.98 7.353e- 
10 91-125 BL00904D 
1.47 6.018e-09 127-168 


1612 


PF00168


C2 domain proteins . 


PF00168C 27.49 3.250e- 
09 365-391 


1613


BL00412 


Neuromodulin (GAP-43) 
proteins . 


BL00412D 16.54 6.051e- 
09 932-983 BL00412D 
16.54 7.153e-09 933- 
984 


1614 


BL00559 


Eukaryotic molybdopterin 
oxidoreductases 
proteins . 


BL00559I 13.63 3.531e-
25 54-83 BL00559K 
13.17 2.957e-18 197-
224 BL00559J 19.63 
6.870e-16 124-176 
BL00559L 13.60 9.000e- 
16 266-284 


1615 


PD01427 


TRANSFERASE 
METHYLTRANSFERASE BI.


PD01427B 22.45 3.025e- 
22 500-541 PD01427A 
19.94 8.773e-18 439-
472


1616 


BL00115 


Eukaryotic RNA 
polymerase II 
heptapeptide repeat 
proteins . 


BL00115Z 3.12 7.485e-
09 152-201 BL00115Z 
3.12 9.603e-09 145-194 


1617 


BL00303 


S-100/ICaBP type calcium 
binding protein. 


BL00303B 26.15 7.750e- 
32 51-88 BL00303A 
21.77 1.947e-31 4-41 


1618 


BL01254 


Fetuin family proteins. 


BL01254F 10.02 8.754e-
09 137-147 


1619 


PD01888 


PEPTIDE REDUCTASE 
PROTEIN METHI . 


PD01888B 25.10 1.000e-
40 47-97 PD01888C 
21.56 7.000e-30 125- 
155 PD01888A 12.84 
8.800e-15 7-23 


1621 


PR00239 


MOLLUSCAN RHODOPSIN C- 
TERMINAL TAIL SIGNATURE 


PR00239E 1.58 3.455e- 
09 692-704 PR00239E 
1.58 4.580e-09 697-709 
PR00239E 1.58 4.580e- 
09 702-714 PR00239E 
1.58 5.193e-09 703-715


1622 


PR00860 


VERTEBRATE
METALLOTHIONEIN
SIGNATURE


PR00860B 7.04 1.900e- 
18 27-41 PR00860C 
9.61 1.474e-14 41-51 
PR00860A 5.46 1.720e-
14 5-18 


1624 


PR00784 


MITOCHONDRIAL BROWN FAT 
UNCOUPLING PROTEIN 
SIGNATURE 


PR00784D 15.86 8.027e- 
11 77-95 


1626 


BL00325 


Actin-depolymerizing 
proteins. 


BL00325B 21.66 1.000e-
40 93-139 BL00325A 
24.83 6.786e-23 61-93


1631 


BL00064 


L-lactate dehydrogenase 
proteins . 


BL00064B 23.57 1.000e-
40 82-130 BL00064C
17.28 1.000e-40 137- 
182 BL00064E 27.20 
1.000e-40 223-275 
BL00064F 25.14 7.882e- 
36 286-331 BL00064A 
21.16 1.000e-33 22-60 
BL00064D 14.19 6.500e- 
31 182-212 


1632 


PR00063 


RIBOSOMAL PROTEIN L27 
SIGNATURE 


PR00063B 15.24 9.700e- 
11 59-84 PR00063A 
11.71 1.614e-09 34-59 


1634 


PR00239


MOLLUSCAN RHODOPSIN C- 
TERMINAL TAIL SIGNATURE 


PR00239D 0.00 1.105e-
11 36-49 PR00239C 
3.51 2.538e-09 37-45 


1636 


BL01210 


Caveolins proteins. 


BL01210B 13.92 9.531e- 
10 133-183 


1637 


BL00982 


Bacterial-type phytoene
dehydrogenase proteins. 


BL00982A 18.41 5.388e- 
11 11-43 


1639 


BL01183 


ubiE/COQ5
methyltransferase family
proteins.


BL01183B 21.31 8.144e- 
12 132-177 


1640 


PR00015 


GRAM-POSITIVE COCCUS
SURFACE PROTEIN ANCHOR 
SIGNATURE 


PR00015B 9.84 8.468e- 
10 128-149 


1641 


PR00320 


G-PROTEIN BETA WD-40
REPEAT SIGNATURE 


PR00320B 12.19 5.935e- 
11 364-379 PR00320A 
16.74 7.828e-11 364-
379 PR00320C 13.01 
2.800e-10 279-294 
PR00320C 13.01 2.800e- 
10 364-379 PR00320B 
12.19 5.114e-10 279- 
294 PR00320A 16.74 
1.659e-09 279-294
PR00320A 16.74 2.098e- 
09 229-244 


1642 


PF00023


Ank repeat proteins. 


PF00023A 16.03 6.464e-
09 114-130 


1643 


PR00169 


POTASSIUM CHANNEL 
SIGNATURE 


PR00169A 16.77 1.806e- 
11 74-94 


1644 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 2.200e-10 
109-120 BL00678 9.67 
5.737e-09 528-539 


1645 


BL01108 


Ribosomal protein L24 
proteins . 


BL01108A 20.33 7.366e- 
17 56-89


1646


PR00380 


KINESIN HEAVY CHAIN 
SIGNATURE 


PR00380A 14.18 9.270e- 
21 103-125 PR00380D 
9.93 6.308e-18 386-408
PR00380C 13.18 7.923e- 
16 332-351 PR00380B 
12.64 6.657e-15 292- 
310 


1647


DM01242 


3 THREONINE--TRNA
LIGASE.


DM01242C 17.15 9.791e- 
37 340-381 DM01242E 
23.00 5.071e-31 463- 
505 DM01242D 23.29 
3.925e-30 420-463 
DM01242B 23.57 8.054e-
18 265-314 DM01242F 
10.61 7.618e-14 526- 
540 




PD00126 


PROTEIN REPEAT DOMAIN 
TPR NUCLEA. 


PD00126A 22.53 5.500e- 
10 13-34 


1651


BL01160 


Kinesin light chain 
repeat proteins . 


BL01160B 19.54 6.720e-
11 431-485 


1652


BL00933


FGGY family of 
carbohydrate kinases 
proteins . 


BL00933A 17.50 4.673e- 
12 11-35 BL00933E 
13.80 9.217e-09 456- 
472 


1653 


BL00795 


Involucrin proteins. 


BL00795C 17.06 2.988e-
10 70-115 


1654 


BL00982 


Bacterial-type phytoene
dehydrogenase proteins.


BL00982A 18.41 7.750e- 
17 302-334 


1655 


BL00982 


Bacterial-type phytoene
dehydrogenase proteins.


BL00982A 18.41 7.750e- 
17 282-314 


1656 


BL00741 


Guanine-nucleotide
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 1.391e- 
16 607-630 


1657


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 7.938e-
11 114-136 


1658 


PR00910 


LUTEOVIRUS ORF6 PROTEIN 
SIGNATURE 


PR00910A 2.51 8.889e- 
10 442-455 


1659 


BL00972 


Ubiquitin carboxyl- 
terminal hydrolases 
family 2 proteins . 


BL00972D 22.55 4.140e- 
12 376-401 BL00972E 
20.72 5.629e-09 446- 
468 


1660 


BL00406 


Actins proteins. 


BL00406D 12.58 8.767e- 
15 188-243 


1661


PR00105 


CYTOSINE-SPECIFIC DNA
METHYLTRANSFERASE
SIGNATURE


PR00105A 10.36 4.900e- 
13 1140-1157 PR00105B 
12.32 2.800e-12 1259- 
1274 PR00105C 10.86 
1.000e-10 1305-1319 


1662 


BL00280


Pancreatic trypsin 
inhibitor (Kunitz) 
family proteins. 


BL00280 24.61 3.172e- 
33 3119-3163 


1663 


PR00319 


BETA G-PROTEIN
(TRANSDUCIN) SIGNATURE 


PR00319D 11.64 6.625e- 
23 107-125 PR00319C 
13.41 5.714e-20 89-105 
PR00319A 15.27 5.286e- 
19 51-68 PR00319B 
11.47 8.200e-19 70-85


1664 


BL00018 


EF-hand calcium-binding 
domain proteins. 


BL00018 7.41 5.050e-10 
489-502 


1667 


PD01066 


PROTEIN ZINC FINGER 
ZINC-FINGER METAL-
BINDING NU. 


PD01066 19.43 8.500e- 
38 7-46 


1669 


BL01153 


NOL1/NOP2/sun family
proteins.


BL01153D 19.69 1.188e- 
17 115-141 BL01153C 
13.67 8.977e-15 66-80
BL01153B 20.52 1.885e- 
10 13-37 


1671 


PR00678 


PI3 KINASE P85 
REGULATORY SUBUNIT 
SIGNATURE 


PR00678H 9.13 3.100e- 
10 1146-1169 


1672 


BL00598 


Chromo domain proteins . 


BL00598 14.45 8.500e-
20 27-49 


1673 


PR00326 


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 8.329e-
09 686-707 


1674 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 7.580e- 
11 343-358 PR00049D
0.00 1.286e-10 342-357 


1676 


PR00747 


GLYCOSYL HYDROLASE 
FAMILY 47 SIGNATURE


PR00747H 12.76 8.636e-
19 427-448 PR00747G 
14.50 2.286e-18 368- 
393 PR00747C 12.06 
7.500e-18 112-131 
PR00747A 14.05 4.600e- 
17 42-63 PR00747D 
15.23 8.759e-17 163- 
183 PR00747E 15.13 
8.244e-15 254-272 
PR00747B 7.65 5.355e-
13 75-90 PR00747F 
13.56 8.714e-10 311- 
328 


1677 


PR00747 


GLYCOSYL HYDROLASE 
FAMILY 47 SIGNATURE 


PR00747H 12.76 8.636e- 
19 309-330 PR00747G 
14.50 2.286e-18 250- 
275 PR00747C 12.06 
7.500e-18 112-131 
PR00747A 14.05 4.600e- 
17 42-63 PR00747B 
7.65 5.355e-13 75-90 
PR00747F 13.56 8.714e- 
10 193-210 


1680 


BL00678 


Trp-Asp (WD) repeat 
proteins proteins. 


BL00678 9.67 4.600e-10 
406-417 BL00678 9.67 
6.684e-09 320-331 


1681


BL00678 


Trp-Asp (WD) repeat
proteins proteins. 


BL00678 9.67 4.600e-10
329-340 BL00678 9.67 
6.684e-09 243-254 


1683 


PR00326 


GTP1/OBG GTP-BINDING
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 1.346e- 
13 389-410 


1685 


PR00646 


RDC1 ORPHAN RECEPTOR 
SIGNATURE 


PR00646H 6.32 4.188e-
09 755-771 


1690 


BL01160 


Kinesin light chain 
repeat proteins. 


BL01160B 19.54 6.644e-
09 75-129 


1691 


PR00456 


RIBOSOMAL PROTEIN P2
SIGNATURE 


PR00456E 3.06 7.281e- 
10 418-433 PR00456E 
3.06 7.281e-10 419-434 
PR00456E 3.06 8.125e-
10 420-435 


1692


PR00456


RIBOSOMAL PROTEIN P2 
SIGNATURE 


PR00456E 3.06 7.281e-
10 487-502 PR00456E 
3.06 7.281e-10 488-503 
PR00456E 3.06 8.125e-
10 489-504 


1693 


BL00674 


AAA-protein family
proteins.


BL00674C 22.60 8.043e- 
24 274-317 BL00674B
4.46 4.000e-23 241-2^3 
BL00674D 23.41 8.560e-
18 338-385 BL00674E 
1 c *>A 1 Tina ic a 1 a 

434 


1697 


PR00409 


PHTHALATE DIOXYGENASE 
REDUCTASE FAMILY 
SIGNATURE 


PR00409F 12.70 4.388e- 
10 427-447 


1698 


PR00466 


CYTOCHROME B-245 HEAVY
CHAIN SIGNATURE 


PR00466C 10.17 3.443e- 
13 187-208 PR00466B 
5.03 5.500e-11 162-186
PR00466F 9.16 6.159e- 


1699 


BL00028 


Zinc finger, C2H2 type, 
domain proteins. 


BL00028 16.07 9.217e-
12 283-300 BL00028
16.07 3.769e-11 255-
272 BL00028 16.07
5.154e-11 171-188
BL00028 16.07 5.500e-
11 227-244 BL00028
16.07 1.600e-10 199-
216 


1700 


BL01019 


ADP-ribosylation factors
family proteins.


BL01019A 13.20 3.348e- 
15 62-102 BL01019B
19.49 4.000e-15 107- 
162 


1703 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 2.484e- 
12 200-239 


1707 


PR00109 


TYROSINE KINASE
CATALYTIC DOMAIN
SIGNATURE


PR00109B 12.27 4.558e- 
14 134-153 


1710 


PR00019 


LEUCINE-RICH REPEAT
SIGNATURE


PR00019A 11.19 2.565e- 
10 116-130 PR00019B 
11.36 4.600e-09 113- 
127 PR00019B 11.36 
7.120e-09 204-218 


1711 


BL01159 


WW/rsp5/WWP domain
proteins.


BL01159 13.85 6.525e-
11 232-247 BL01159

628 


1712 


PF00023 


Ank repeat proteins . 


PF00023A 16.03 7.000e- 

J. V/ JLO / "£UJ 


1713 


PF00642 


Zinc finger C-x8-C-x5-C- 
x3-H type (and similar).


PF00642 11.59 9.550e- 


1714 


PF00642 


Zinc finger C-x8-C-x5-C- 
x3-H type (and similar).


PF00642 11.59 9.550e- 
11 230-241 


1715 


BL01115 


GTP-binding nuclear
protein ran proteins. 


P.T.ni 1 1 CA 1 n 01 h i n"n „ 
bbUlilSA 1 . I29e- 

09 7-51 


1718 


BL00353 


HMG1/2 proteins.


tJijUUJDoL 14.83 6 . 018e- 
10 136-183 BL00353B 
11.47 8.866e-09 86-136 


1719 


BL00412 


Neuromodulin (GAP-43) 
proteins . 


BL00412D 16.54 5.408e- 

no A 1 O A Q1 


1721 


BL00038


Myc-type, 'helix-loop-
helix' dimerization
domain proteins.


BL00038B 16.97 8.448e-
12 79-100 BL00038A
13.61 4.000e-11 52-68


1723 


PD00567 


PROTEIN RNA-BINDING RNA
REPEAT HYD. 


PD00567C 9.17 8.500e- 
09 418-428


1724 


BL01279 


Protein-L-
isoaspartate (D-
aspartate) O-
methyltransferase signa.


BL01279A 24.27 5.663e- 
12 233-281 


1728 


BL00018 


EF-hand calcium-binding 
domain proteins . 


BL00018 7.41 2.059e-11
73-86 BL00018 7.41
4.176e-11 157-170


1730 


BL00594 


Aromatic amino acids 
permeases proteins. 


BL00594A 16.75 1.089e- 
09 17-61


1731 


BL01160 


Kinesin light chain 
repeat proteins . 


BL01160B 19.54 9.676e- 
10 296-350 


1732 


BL01160 


Kinesin light chain 
repeat proteins . 


BL01160B 19.54 9.676e- 
10 316-370 


1733 


PF00850 


Histone deacetylase 
family. 


PF00850F 15.70 4.349e- 
22 246-279 PF00850D
14.76 6.850e-20 177- 
201 PF00850E 8.88
8.691e-18 209-235 
PF00850G 22.75 4.098e- 
14 281-323 


1734 


BL00354 


HMG-I and HMG-Y DNA- 
binding domain proteins 
(A+T-hook).


BL00354C 6.61 5.932e- 
09 292-307 


1735 


DM00179 


kw KINASE ALPHA ADHESION
T-CELL. 


DM00179 13.97 5.263e- 
10 492-502 


1743 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 1.188e- 
11 5-27 PR00449D 
10.79 2.241e-10 109- 
123 PR00449E 13.50 
9.289e-10 144-157


1744 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449A 13.20 1.188e-
11 5-27 PR00449D 
10.79 2.241e-10 109-
123 PR00449E 13.50 
9.289e-10 144-157 


1745 


BL00720 


Guanine-nucleotide
dissociation stimulators 
CDC25 family sign. 


BL00720B 16.57 8.297e-
15 136-160 


1746 


PR00081 


GLUCOSE/RIBITOL 
DEHYDROGENASE FAMILY 
SIGNATURE 


PR00081B 10.38 6.727e- 
11 45-57 PR00081E 
17.54 3.935e-10 150- 
168 


1747 


BL00439 


Acyltransferases
ChoActase / COT / CPT 
family proteins. 


BL00439H 18.24 8.435e- 
14 65-91 BL00439G 
13.40 2.895e-12 3-14 


1749 


PR00819 


CBXX/CFQX SUPERFAMILY 
SIGNATURE 


PR00819B 10.83 7.158e- 
11 4-20 


1751 


PD00066 


PROTEIN ZINC-FINGER
METAL-BINDI.


PD00066 13.92 3.400e- 
14 33-46 PD00066 
13.92 1.000e-13 89-102 
PD00066 13.92 7.000e- 
13 61-74 PD00066 
13.92 6.571e-12 117-
130 


1753 


BL01013 


Oxysterol-binding
protein family proteins. 


BL01013D 26.81 6.516e-
18 33-77 


1754 


BL00790 


Receptor tyrosine kinase 
class V proteins. 


BL00790I 20.01 2.393e- 
09 490-521 BL00790I 
20.01 2.821e-09 60-91 
BL00790I 20.01 6.357e- 
09 287-318


1756 


PD01066 


PROTEIN ZINC FINGER
ZINC-FINGER METAL-
BINDING NU.


PD01066 19.43 9.750e-
35 10-49 


1758 


DM00406 


GLIADIN. 


DM00406 7.73 7.600e-09 
653-666 


1762 


PD02929 


ADHESION GLYCOPROTEIN 
PRECURSOR I. 


PD02929A 28.27 4.529e- 
09 224-278 


1765 


PR00326 


GTP1/OBG GTP-BINDING 
PROTEIN FAMILY SIGNATURE 


PR00326A 8.75 5.950e- 
11 146-167 


1775 


PF00023 


Ank repeat proteins . 


PF00023A 16.03 3.077e- 
14 523-539 


1776 


BL00942 


glpT family of 
transporters proteins. 


BL00942F 15.07 4.343e- 
10 371-389 BL00942B 
20.36 8.040e-09 94-137 


1777 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 2.373e- 
09 279-312


1778 


BL00084 


Copper type II, 
ascorbate-dependent
monooxygenases proteins.


BL00084D 25.11 3.700e- 
20 169-224 BL00084B 
24.26 8.134e-16 10-58
BL00084C 27.71 8.412e- 
11 107-158 


1779 


BL01013 


Oxysterol-binding
protein family proteins. 


BL01013D 26.81 3.758e- 
18 611-655 BL01013A 
25.14 2.881e-15 344-
380 BL01013C 9.97 
6.308e-13 435-445 
BL01013B 11.33 3.717e- 
12 409-420 


1783 


BL00741 


Guanine-nucleotide 
dissociation stimulators 
CDC24 family sign.


BL00741B 14.27 8.138e- 
13 492-515 


1784


BL00741 


Guanine-nucleotide 
dissociation stimulators 
CDC24 family sign. 


BL00741B 14.27 8.138e- 
13 492-515 



* Results include, in order: accession number subtype; raw score; p-value; position of
signature in amino acid sequence.
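
The field order given in the footnote above can be illustrated with a short sketch that splits one RESULTS cell into its individual hits. This is only an illustration under stated assumptions: the names `SignatureHit` and `parse_results` are hypothetical and are not part of the original disclosure, and the sketch assumes each hit consists of exactly four whitespace-separated fields, with values that were wrapped after a hyphen (for example a p-value exponent or a position range) rejoined before parsing.

```python
import re
from typing import List, NamedTuple


class SignatureHit(NamedTuple):
    """One hit from a RESULTS cell (hypothetical container, for illustration only)."""
    subtype: str      # accession number subtype, e.g. "BL00522C"
    raw_score: float  # raw score
    p_value: float    # p-value
    start: int        # first residue of the signature in the amino acid sequence
    end: int          # last residue of the signature in the amino acid sequence


def parse_results(cell: str) -> List[SignatureHit]:
    """Parse a RESULTS cell laid out as: subtype, raw score, p-value, position."""
    # Rejoin values wrapped after a hyphen, e.g. "6.600e-\n18" or "364-\n410".
    joined = re.sub(r"-\s*\n\s*", "-", cell)
    tokens = joined.split()
    if len(tokens) % 4 != 0:
        raise ValueError("expected four fields per hit")
    hits = []
    for i in range(0, len(tokens), 4):
        subtype, score, p_value, position = tokens[i:i + 4]
        start, end = position.split("-")
        hits.append(
            SignatureHit(subtype, float(score), float(p_value), int(start), int(end))
        )
    return hits


# Example using the first two hits listed for SEQ ID NO: 1562 in Table 3.
example = "BL00522C 11.90 6.600e- \n18 412-436 BL00522B \n27.30 1.738e-16 364- \n410"
hits = parse_results(example)
```

Cells in which a field is illegible or missing would raise an error or misalign under this sketch and would need to be checked by hand.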
TABLE 4



SEQ ID 
NO: 


PFAM NAME 


DESCRIPTION 


p-value


PFAM 
SCORE 


2


ig


Immunoglobulin domain


2.1e-32


109.5


3 


pkinase 


Eukaryotic protein kinase 
domain 


1.3e-29 


110.7


4 


zf-C2H2 


Zinc finger, C2H2 type 


1.6e-21


84.9


5 


fn3


Fibronectin type III domain 


0 


1097.1


6 


fn3 


Fibronectin type III domain 


0 


1035.0 


7 


fn3 


Fibronectin type III domain 


0 


1090.4


8 


fn3 


Fibronectin type III domain 


0 


1097.1 


9 


TBC 


TBC domain 


4e-40 


146.7 


10 


p450 


Cytochrome P450


9.5e-17 


62.0


12 


ank 


Ank repeat 


6e-20 




14 


ig 


Immunoglobulin domain 


1.7e-05 


22.7 


15 


zf-MYND


MYND finger


X . Jc-Uo 


5 b . 4 


16 


zf-MYND


MYND finger


X . je-Ub 


35.4 


17 


zf-C2H2


Zinc finger, C2H2 type


1.7e-99


343.9 


18 


CAP_GLY


CAP-Gly domain 


1.2e-25 


98.7


20


IMPDH_C


IMP dehydrogenase / GMP
reductase C terminus


1.6e-119


410.5 


21 


IMPDH_C


IMP dehydrogenase / GMP
reductase C terminus


4.3e-102


352.6


22 


pkinase


Eukaryotic protein kinase
domain


O /Id 1 Q 

z . se- / y 


- 

277.0


23 


pkinase 


Eukaryotic protein kinase 


8.4e-74 


258.6 


25 


RNA_pol_A


RNA polymerase alpha subunit


0


1077.7 


26 


C1q


C1q domain


1.9e-10


44.4


27 


Ribosomal_L2
3


Ribosomal protein L23


7.8e-32


111.2


28 


Ribosomal_L2
3 






104.2 


30 


zf-A20


A20-like zinc finger


1.5e-10


48.5


31 


zf-A20 


A20-like zinc finger 


1.5e-10 


48.5 


32 


FMN_dh






608.1 


34 


PID 


Phosphotyrosine interaction 
domain (PTB/PID) 


3.8e-59 


209.9 


35 


ig 


Immunoglobulin domain 


1.4e-13 


48.8 


36 


ig 


Immunoglobulin domain 


1.4e-13


48.8


40 


kinesin 


Kinesin motor domain 


6.7e-76


zoo . b 


44 


Ets 


Ets-domain


1.4e-56


106 . X 


45 


Ets 


Ets-domain


1.4e-56 


182.1 


46 


LRR 


Leucine Rich Repeat


1.7e-13


58.3


48 


zf-C2H2


Zinc finger, C2H2 type


/ . Je- xoz 


552.8


49 


ITAM


Immunoreceptor tyrosine-based
activation mot


1.4e-05


31.9 


50 


UCH-2 


Ubiquitin carboxyl-terminal
hydrolase family




102.0


51 


UCH-2 


Ubiquitin carboxyl-terminal
hydrolase family


1.1e-26


102.0 


52 


ras 


Ras family 


□ . DC 3 


lot • J 


53 


PRK 


Phosphoribulokinase


& . 1c O j 


z j U . / 


54 


myb_DNA-
binding


Myb-like DNA-binding domain




1j . ^ 


55 


voltage_CLC 


Voltage gated chloride channels 


3.3e-186 


631.9 


56 


sugar_tr 


Sugar (and other) transporter 


0.00015 


-64.3 


57 


TBC 


TBC domain 


2.2e-37 


137.6 


58 


ank 


Ank repeat 


5.9e-25 


96.3


59 


ank 


Ank repeat 


5.9e-25 


96.3


67 


PMP22_Claudi 
n 


PMP-22/EMP/MP20/Claudin family 


7.9e-49 


175.6 


68 


C2 


C2 domain 


7.9e-54


192.2


69 


C2 


C2 domain 


2.3e-54 


194.0


70


Kelch 


Kelch motif


9.4e-99


341.5 


72 


ig 


Immunoglobulin domain 


8.2e-28


94.7 


73 


pkinase 


Eukaryotic protein kinase
domain


8e-69


242.1


74


pkinase 


Eukaryotic protein kinase 
domain 


2.8e-38 


140.6 


76 


zf-
C4_Topoisom


Topoisomerase DNA binding C4 
zinc fing 


5.4e-54 


192.8


83


Peptidase_S9 


Prolyl oligopeptidase family 


4.3e-10


36.8 


84 


fn3 


Fibronectin type III domain 


4.1e-51


183.2 


85


SH2 


Src homology domain 2 


3.1e-22 


67.7 


88 


ig


Immunoglobulin domain 


0.0091 


14.0


89


WD40


WD domain, G-beta repeat 


2.1e-21 


84.6 


92 


laminin_G


Laminin G domain 


6.1e-27 


98.5 


93 


AMP-binding 


AMP-binding enzyme 


2.4e-13 


-37.2 


95 


pkinase 


Eukaryotic protein kinase 
domain 


1.4e-59 


211.4 


96 


pkinase 


Eukaryotic protein kinase 
domain 


2.6e-51 


183.9


97 


adh_short 


short chain dehydrogenase 


2e-61


217.5 


98 


kinesin 


Kinesin motor domain 


2.2e-86 


300.4 


101 


IRS 


PTB domain (IRS-1 type) 


5.4e-36 


133.0


102 


AAA 


ATPases associated with various 
cellular act 


6.8e-05


-5.2 


104 


pkinase 


Eukaryotic protein kinase
domain 


2.7e-73 


256.9 


106 


ras 


Ras family 


8.3e-24 


92.5 


107 


FYVE 


FYVE zinc finger 


5.4e-27 


100.7 


108 


Cyt_reductas 
e 


FAD/NAD-binding Cytochrome
reductase 


7.7e-61 


215.5 


109 


zf-C2H2 


Zinc finger, C2H2 type 


2.3e-122 


420.0 


113 


pkinase 


Eukaryotic protein kinase 
domain 


4e-88 


306.2 


116 


PH 


PH domain 


3.1e-11


45.2 


117 


lipocalin 


Lipocalin / cytosolic fatty- 
acid binding pr 


2.4e-14 


53.5 


118 


pkinase 


Eukaryotic protein kinase 
domain 


4.5e-20 


76.3 


120 


WD40


WD domain, G-beta repeat


2.4e-14


61.1 


121 


WD40


WD domain, G-beta repeat


2.4e-14


61.1 


123 


IF5_eIF4_eIF 
2 


eIF4-gamma/eIF5/eIF2-epsilon


1e-32


122.2


124 


ig 


Immunoglobulin domain 


6.5e-08


30.6


127 


mito_carr 


Mitochondrial carrier proteins 


3e-16 


58.6 


128 


PP2C 


Protein phosphatase 2C 


2.2e-71


250.6 


129 


ATP1G1_PLM_M 
AT8


ATP1G1/PLM/MAT8 family 


3.1e-20


80.6 


130 


pfkB 


pfkB family carbohydrate kinase 


4.5e-42


137.1 


133 


ACBP


Acyl CoA binding protein 


4.6e-22 


86.7 


134 


rrm 


RNA recognition motif. 


1.2e-31 


118.5 


135 


IQ 


IQ calmodulin-binding motif 


2.6e-08 


41.0 


136 


ATP1G1_PLM_M 
AT8 


ATP1G1/PLM/MAT8 family 


9.3e-22 


85.7 


139 


WH2 


Wiskott Aldrich syndrome 
homology region 2 


0.0067 


23.1 


140 


zf-C2H2 


Zinc finger, C2H2 type 


1.7e-82 


287.5 


141 


Peptidase_S2
6 


Signal peptidase I 


5.7e-10 


35.7 


143 


arf 


ADP-ribosylation factor family


1.2e-39 


145.2 


146 


KRAB 


KRAB box 


7.3e-30 


112.6




DUF6 


Integral membrane protein DUF6 


0.096 


8.0


149 


PDEase 


3'5'-cyclic nucleotide
phosphodiesterase 


3.8e-80 


231.1


151 


S4 


S4 domain 


1.1e-08


42.3 


153


tRNA-synt_1d


tRNA synthetases class I (R) 


3.8e-103 


356.1 


154 


Cyt_reductas 
e 


FAD/NAD-binding Cytochrome
reductase 


7.8e-60 


212.2 


155 


ras


Ras family 


3.6e-28 


107.0 


157 


actin 


Actin 


3.8e-26 


87.1


158 


Jacalin 


Jacalin-like lectin domain 


0.09 


-24.9


160 


Zn_carbopept 


Zinc carboxypeptidase 


5e-138 


471.9 


165 


pkinase


Eukaryotic protein kinase 
domain 


5.1e-67 


236.1


167


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


5.3e-07 


27.0 


168 


Ribosomal_S1
5 


Ribosomal protein S15 


1.1e-06


29.0 


169


DEAD 


DEAD/DEAH box helicase 


1e-48


157.0 


171 


DUF59 


Domain of unknown function 
DUF59 


0.07


-17.4 


172 


pkinase 


Eukaryotic protein kinase 
domain 


3.7e-15


58.6


173 


globin 


Globin 


4.6e-18 


67.4 


174 


WW 


WW domain 


7.3e-06 


32.9


175 


ras 


Ras family 


1e-31


118.8 


178 


ATP1G1_PLM_M 
AT8


ATP1G1/PLM/MAT8 family 


2.5e-17 


71.0 


179 


zf-C2H2


Zinc finger, C2H2 type 


1.5e-99 


344.2 


180 


C1q


C1q domain


8.8e-72


251.9


190 


Y_phosphatas
e 


Protein-tyrosine phosphatase


4.9e-287 


967.0 


191 


efhand 


EF hand 


7.5e-16 


66.1 


193 


pkinase 


Eukaryotic protein kinase 
domain 


6.5e-82 


285.6


194 


bromodomain 


Bromodomain 


5.8e-31 


111.4


195 


PALP 


Pyridoxal-phosphate dependent
enzyme 


2.5e-64 


227.1 


197 


DnaJ 


DnaJ domain 


1.6e-38 


141.4


199 


RrnaAD 


Ribosomal RNA adenine 
dimethylases 


0.00018 


16.9 


200 


acid_phospha 
t 


Histidine acid phosphatase 


2.5e-10


37.2 


201 


WH2 


Wiskott Aldrich syndrome 
homology region 2 


0.00048 


26.9 


204 


vATP- 
synt_AC39 


ATP synthase (C/AC39) subunit 


1.3e-159 


543.7


205 


vATP- 
synt_AC39


ATP synthase (C/AC39) subunit


1.6e-139 


476.9


206 


ldl_recept_a 


Low-density lipoprotein 
receptor domain 


2.4e-25 


97.6 


209 


ank 


Ank repeat 


1.4e-19 


78.4 


210 


Rhomboid 


Rhomboid family 


0.0035 


1.2 


211 


C1q


C1q domain


1.6e-70 


247.7 


212 


UQ_con 


Ubiquitin-conjugating enzyme


7.4e-74 


258.8 


213 


UQ_con 


Ubiquitin-conjugating enzyme


1e-53


191.9 


215 


DEAD 


DEAD/DEAH box helicase 


1.8e-43 


140.4 


216 


PMP22_Claudi 
n 


PMP-22/EMP/MP20/Claudin family 


4.5e-21


83.4


218


Glycos_trans
f_2 


Glycosyl transferases 


4e-21 


83.6


219 


ig 


Immunoglobulin domain 


0.092


10.7 


222 


WD40


WD domain, G-beta repeat 


7.4e-23 


89.4 


224 


TPR 


TPR Domain 


1.2e-08 


42.1 


225 


DnaJ_CXXCXGX 
G 


DnaJ central domain (4 repeats) 


1.5e-38 


141.5 


226 


DnaJ_CXXCXGX
G


DnaJ central domain (4 repeats) 


1.5e-38 


141.5 


229 


HSP70 


Hsp70 protein 


2.4e-54 


194.0 


230 


GSHPx 


Glutathione peroxidases 


3.4e-47


170.2 


231 


tsp_1


Thrombospondin type 1 domain 


0.0075 


17.1 


233 


cyclin 


Cyclin 


4.6e-144 


492.0 


234 


ras 


Ras family 


4.8e-50 


179.7


235 


LRR 


Leucine Rich Repeat 


1.2e-30 


115.3 


236 


LRR 


Leucine Rich Repeat 


6.7e-29


109.4 


237 


PDZ 


PDZ domain (Also known as DHR 
or GLGF).


1.7e-09 


45.0


244 


dCMP_cyt_dea 
m 


Cytidine and deoxycytidylate
deaminase 


2.5e-0* 


31.1 


245 


ig 


Immunoglobulin domain 


6.7e-08 


30.5 


248 


wnt


wnt family of developmental 
signaling protei 


9.1e-270


742.6 


250 


mito_carr 


Mitochondrial carrier proteins 


1.3e-55 


193.6 


254 


adenylatekin 
ase 


Adenylate kinase 


1.8e-14 


55.7 


255 


Cation_efflu
x


Cation efflux family 


2.8e-33


124.0 


256 


SH3 


SH3 domain 


3.9e-14


60.4 


257 


Aa_trans 


Transmembrane amino acid 
transporter protein 


2.6e-52 


187.2 


258 


adenylatekin 
ase 


Adenylate kinase 


2.1e-110


380.2 


259 


HIT 


HIT family 


8.2e-07 


25.3 


260 


Bacterial_PQ
Q 


PQQ enzyme repeat 


1.6e-15 


65.0 


262 


proteasome 


Proteasome A-type and B-type


6.5e-64 


225.7 


267 


pkinase 


Eukaryotic protein kinase 
domain 


6.3e-27


101.0 


270 


filament 


Intermediate filament proteins 


3.2e-150


512.5


271 


Choline_kina 
se 


Choline/ethanolamine kinase 


2e-67 


237.4 


277 


Ribosomal_S7 


Ribosomal protein S7p/S5e 


3.3e-20


80.6 


279 


pkinase 


Eukaryotic protein kinase 
domain 


3.3e-77


269.9 


280 


WD40


WD domain, G-beta repeat


7.8e-73


255.4 


281 


WD40


WD domain, G-beta repeat 


7.8e-73 


255.4 


284 


zf-DHHC 


DHHC zinc finger domain 


4.6e-24


93.4


287 


Exonuclease


Exonuclease


1.4e-67


238.0 


291 


SAM 


SAM domain (Sterile alpha 
motif) 


0.034 


11.2 


292 


SAM 


SAM domain (Sterile alpha 
motif) 


0.034 


11.2 


294 


zf-C2H2


Zinc finger, C2H2 type


1.4e-29


111.7 


295 


zf-C2H2 


Zinc finger, C2H2 type 


2.2e-125


430.0 


296 


mito_carr 


Mitochondrial carrier proteins 


4.1e-59 


205.5 


297 


HMG_box


HMG (high mobility group) box 


6.7e-29 


109.4 


302 


Glycos_trans
f_4 


Glycosyl transferase 


5e-87 


302.5


304 


tRNA-synt_2 


tRNA synthetases class II (D, K 
and N) 


1.1e-84


294.8 


305 


KRAB 


KRAB box 


2e-44 


161.0 


306 


rrm 


RNA recognition motif. 


2.7e-44 


160.6 


308 


7tm_1


7 transmembrane receptor 
(rhodopsin family) 


5.2e-39 


12*. 1 


309 


DNA_polymera
seX 


DNA polymerase X family 


2.4e-64 


227.2 


311 


F-box 


F-box domain. 


9.5e-08 


39.2 


312 


ig


Immunoglobulin domain 


6.8e-19 


6*. 9 


313 


Ets 


Ets-domain


8.1e-60


192.3


315 


Kelch 


Kelch motif 


1.3e-106 


367.6 


317 


arf 


ADP-ribosylation factor family


3.2e-35


130.4 


318 


sugar_tr 


Sugar (and other) transporter 


0.0003 


-73.1


320 


pkinase 


Eukaryotic protein kinase 
domain 


8.1e-83 


288.6


322 


pkinase 


Eukaryotic protein kinase 
domain 


4.9e-81


282.6 


324 


xlink 


Extracellular link domain 


4.5e-143 


331.5 


326 


ARID 


ARID DNA binding domain 


5.1e-37 


136.4 


327 


HMG_box 


HMG (high mobility group) box 


6.7e-29 


109.4 


328


cadherin


Cadherin domain


8.1e-81 


281.9 


331 


chromo 


'chromo' (CHRromatin
Organization Modifier) 


4e-18 


66.7 


333 


Peptidase_M2
2 


Glycoprotease family 


1.2e-136 


467.4


335 


vwa 


von Willebrand factor type A 
domain 


2.3e-07


37.9


339 


ras 


Ras family 


7.8e-07


-59.1 


340 


zf-C2H2 


Zinc finger, C2H2 type 


8.2e-64


225.4 


342 


zf-C2H2


Zinc finger, C2H2 type


2.4e-85


297.0 


343 


ig 


Immunoglobulin domain 


0.0005 


18.0 


346 


pkinase 


Eukaryotic protein kinase 
domain 


6.5e-65


229.1


347 


pkinase 


Eukaryotic protein kinase 
domain


6.5e-65


229.1 


351 


EGF 


EGF-like domain 


8.5e-20 


79.2 


352 


ank 


Ank repeat 


2.5e-101


350.0 


354 


TBC 


TBC domain 


5.1e-15 


63.3 


355 


PHD 


PHD-finger


3.2e-07


37.4 


358 


DUF6 


Integral membrane protein DUF6 


0.033


15.8


359 


zf-C2H2 


Zinc finger, C2H2 type 


7.4e-20 


79.4


361 


ank 


Ank repeat 


6.6e-34


126.1


362 


ArfGap 


Putative GTP-ase activating 
protein for Arf 


4.7e-53


189.7 


363 


efhand


EF hand


5.4e-10


46.6


367 


LRR


Leucine Rich Repeat


8.5e-44


158.9 


368 


laminin_G


Laminin G domain


1.5e-33


121.7 


369 


PP2C 


Protein phosphatase 2C 


5.3e-20


73.9


372 


LIM 


LIM domain containing proteins 


9.9e-15


57.1


373 


KRAB 


KRAB box 


4.8e-23


90.0


376


ion_trans


Ion transport protein 


2.9e-09


-4.2


377 


Beach 


Beige/BEACH domain


4.9e-208


704.5


380 


pkinase 


Eukaryotic protein kinase 
domain 


1.6e-94


327.5


381 


AMP-binding 


AMP-binding enzyme 


1.4e-07


-140.3


382 


HECT 


HECT-domain (ubiquitin-
transferase).


1.3e-07


-13.5


384 


ank 


Ank repeat 


2.5e-101


350.0


386


ig


Immunoglobulin domain


9.5e-06


23.6


388 


zf-C2H2 


Zinc finger, C2H2 type 


1.7e-42 


154.6


389 


ig 


Immunoglobulin domain


2.8e-15 


54.3 


390 


mito_carr


Mitochondrial carrier proteins 


3.5e-67 


233.2


392 


TPR 


TPR Domain 


6.1e-17 


69.7


393 


SH3 


SH3 domain 


3.5e-09 


43.9


394 


AAA 


ATPases associated with various 
cellular act 


4.1e-21


83.6


396 


spectrin 


Spectrin repeat 


2.1e-67 


237.3


397 


zf-C2H2 


Zinc finger, C2H2 type 


0.0066 


23.1


399 


fn3


Fibronectin type III domain 


4.1e-102


352.6 


400 


WD40 


WD domain, G-beta repeat 


0.00049


26.8


401 


E1_dehydrog


Dehydrogenase E1 component


3e-119 


409.6 


402 


fn3


Fibronectin type III domain 


0 


1719.6


404 


LRR 


Leucine Rich Repeat 


2.1e-10


48.0


405 


cadherin 


Cadherin domain 


8.1e-81


281.9


406 


zf-CXXC 


CXXC zinc finger 




63.4


410 


RhoGEF 


RhoGEF domain 


1.1e-23


92.1


411 


F-box 


F-box domain.


4.2e-06


33.7


412 


SNF2_N 


SNF2 and others N-terminal
domain


5.8e-16


61.6 


415 


CPSase_L_chain


Carbamoyl-phosphate synthase
(CPSase) 


1.5e-172 


586. e 


418 


LRR


Leucine Rich Repeat 


3.8e-24 


93.6


419 


DENN 


DENN (AEX-3) domain 


2e-58 


207.5


420 


RasGEF 


RasGEF domain 


8.1e-43 


155.7


421 


ank 


Ank repeat 


1.4e-153 


523.7 


424 


G-patch


G-patch domain


1e-19


78.9


425 


pkinase 


Eukaryotic protein kinase 
domain 


2.2e-31 


117.1


426 


Plexin_repeat


Plexin repeat


0.0023


24.6


427 


Plexin_repeat


Plexin repeat


0.0023


24.6


429 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


8.6e-11


39.2 


431 


DEAD 


DEAD/DEAH box helicase 


1e-66


214.0 


432 


SH3 


SH3 domain 


3.4e-16 


67.2 


433 


GTP_CDC 


Cell division protein 


2.1e-114


393.5 


436 


Collagen 


Collagen triple helix repeat 
(20 copies)


4.6e-194 


658.1 


438 


Ricin_B_lectin


Similarity to lectin domain of 
ricin b 


0.0085 


10.5 


441 


Alpha_adaptin_C


Alpha adaptin carboxyl-terminal
domai 


1.2e-256 


866.0 


442 


Alpha_adaptin_C


Alpha adaptin carboxyl-terminal
domai 


1.8e-235 


795.7 


443 


PDZ 


PDZ domain (Also known as DHR 
or GLGF).


1.9e-65


230.9 


445 


LON 


ATP-dependent protease La (LON)
domain


0.00012


-17.1 


446


ig


Immunoglobulin domain 


0.00011


20.1 


451


sushi 


Sushi domain (SCR repeat) 


1.4e-18 


75.2 


452 


fn3 


Fibronectin type III domain 


1.5e-06


35.2 


454 


pyridoxal_deC


Pyridoxal-dependent
decarboxylase conse


8.3e-14


50.3


456 


kinesin 


Kinesin motor domain 


4.9e-217


734.4


457 


neur_chan 


Neurotransmitter-gated ion- 
channel 


1e-175


597.1 


458 


Josephin 


Josephin 


0.0002


18.7 


468 


bZIP 


bZIP transcription factor 


1.7e-07 


31.8 


470 


NTP_transferase


Nucleotidyl transferase


6.3e-06


-26.3 


471 


WD40 


WD domain, G-beta repeat 


2e-28 


107.9 


473 


LIM 


LIM domain containing proteins 


0.00021 


20.7 


477 


zf-RanBP 


Zn-finger in Ran binding
protein and others.


0.028


21.0


479 


WD40 


WD domain, G-beta repeat 


6.5e-18 


73.0 


480 


KRAB 


KRAB box 


1e-31


118.8 


481 


ArfGap 


Putative GTP-ase activating 
protein for Arf 


8.4e-66 


232.0 


485 


SH2 


Src homology domain 2 


0.011 


11.4


486 


C1q


C1q domain


4.3e-74


259.6 


487 


dsrm 


Double-stranded RNA binding
motif


1.1e-47


171.9


489


zf-C2H2 


Zinc finger, C2H2 type 


4.8e-153 


521.9 


490 


Alpha_adaptin_C


Alpha adaptin carboxyl-terminal
domai 


3.4e-222 


751.6 


492 


SKI 


Shikimate kinase 


1.2e-10 


48.8 


497 


ENV_polyprotein


ENV polyprotein (coat 
polyprotein) 


2.6e-22 


77.6 


498 


abhydrolase_2


Phospholipase/Carboxylesterase 


0.041 


-48.1 


500 


rrm 


RNA recognition motif. 


5.4e-34 


126.4 


501 


WW 


WW domain 


4.6e-18 


73.4


502


ig


Immunoglobulin domain 


1.1e-10


39.5 


504 


abhydrolase 


alpha/beta hydrolase fold 


0.045


-3.6 


505 


vwa 


von Willebrand factor type A 
domain 


7.1e-62 


219.0 


508 


Na_K_ATPase_C


Na+/K+ ATPase C-terminus


2.3e-145 


496.3 


509 


Exonuclease 


Exonuclease 


1.3e-56 


201.5


510 


Glycos_transf_1


Glycosyl transferases group 1


2.9e-06 


27.0 


511 


Glycos_transf_1


Glycosyl transferases group 1 


2.9e-06 


27.0 


512 


Glycos_transf_1


Glycosyl transferases group 1


1.9e-09 


38.5 


514 


pro_isomerase


Cyclophilin type peptidyl-
prolyl cis-tr 


1.8e-63 


221.4


515 


EGF 


EGF-like domain


1.9e-18


74.7


516 


Surp 


Surp module 


4.3e-38 


140.0 


523


ig


Immunoglobulin domain 


3.3e-06 


25.0 


526 


UBX 


UBX domain 


1.1e-34


128.6


528 


adh_zinc


Zinc-binding dehydrogenases


2.7e-34


127.4


530 


SAM 


SAM domain (Sterile alpha 
motif) 


0.046 


10.0 


531 


adh_short 


short chain dehydrogenase 


0.0025 


-34.1 


532 


mito_carr


Mitochondrial carrier proteins


2.5e-81


281.7 


533 


mito_carr 


Mitochondrial carrier proteins 


2e-61


213.5 


534 


thiolase 


Thiolase 


3.5e-183 


622.0 


535 


FMO-like 


Flavin-binding monooxygenase- 
like 


0 


1153.7


536 


SCAN 


SCAN domain 


4e-55 


196.6 


537


tRNA-synt_1


tRNA synthetases class I (I, L, 
M and V) 


3.1e-136 


466.0 


538 


tRNA-synt_1


tRNA synthetases class I (I, L, 
M and V) 


3.1e-136 


466.0


539 


tRNA-synt_1


tRNA synthetases class I (I, L, 
M and V) 


1.9e-117 


403.6


540 


tRNA-synt_1


tRNA synthetases class I (I, L, 
M and V) 


3.1e-136 


466.0


541 


vATP-synt_E


ATP synthase (E/31 kDa) subunit 


5.9e-85 


295.7 


543 


zf-C2H2 


Zinc finger, C2H2 type 


5.5e-69


242.6 


544 


DUF101


Protein of unknown function 
DUF101 


8.5e-38 


139.0 


545 


TGFb_propeptide


TGF-beta propeptide


1.1e-67


238.2 


547 


WD40


WD domain, G-beta repeat 


2.6e-32 


120.8 


548 


RHD 


Rel homology domain (RHD).


1.6e-238


686.2


549 


MMR_HSR1 


GTPase of unknown function


5.4e-67 


236.0 


551 


HECT 


HECT-domain (ubiquitin-
transferase).


4.3e-127 


435.6 


554 


MHC_II_alpha 


Class II histocompatibility 
antigen, alp 


3.5e-74 


259.8 


555 


zf-UBR1


Putative zinc finger in N-
recognin


3.3e-16


67.3 


556 


Kelch 


Kelch motif 


5.5e-29 


109.7 


561 


AMP-binding 


AMP-binding enzyme 


2.8e-06


-163.7


562 


PABP 


Poly-adenylate binding protein,
unique domai 


4.9e-38 


139.8 


564 


Gag_p30


Gag P30 core shell protein 


1.2e-67 


238.2 


566 


PWWP 


PWWP domain 


8.1e-16


66.0 


567 


SCAN 


SCAN domain 


7.3e-68 


238.9 


569 


pkinase 


Eukaryotic protein kinase 
domain 


1.5e-84 


294.3


570 


pkinase 


Eukaryotic protein kinase
domain 


1.5e-84 


294.3 


571 


CN_hydrolase 


Carbon-nitrogen hydrolase


0.00081


-79.7 


572 


myosin_head 


Myosin head (motor domain)


0 


1495.2 


573 


myosin_head


Myosin head (motor domain) 


0 


1490.4 


575 


Surp 


Surp module 


1.7e-23 


91.5 


576 


Surp 


Surp module


1.7e-23


91.5


577 


DNA_pol_B


DNA polymerase family B 


0 


1138.6 


578 


PDZ 


PDZ domain (Also known as DHR 
or GLGF).


8.3e-09 


42.7 


579 


LRR 


Leucine Rich Repeat 


4.9e-21


83.3


580 


neur_chan 


Neurotransmitter-gated ion- 
channel 


5.9e-177 


601.3 


583 


sushi


Sushi domain (SCR repeat) 


0 


1673.0


584 


DEAD 


DEAD/DEAH box helicase


7.3e-36 


116.3 


586 


KH-domain 


KH domain 


2.9e-13 


57.5 


587 


G-patch 


G-patch domain 


2.3e-14 


61.2 


589 


LIM 


LIM domain containing proteins 


2.3e-36 


133.4 


590 


bromodomain 


Bromodomain 


6.6e-32 


114.7 


591 


bromodomain 


Bromodomain 


6.6e-32 


114.7


592


hormone_rec 


Ligand-binding domain of 
nuclear hormone 


3.5e-22 


87.1 


593 


PHD 


PHD-finger


3.8e-12 


53.8 


594 


cadherin 


Cadherin domain 


4.2e-99 


342.7


596 


pkinase 


Eukaryotic protein kinase 
domain 


5e-92 


319.2 


597 


WD40 


WD domain, G-beta repeat 


0.00054 


26.7 


600 


FG-GAP 


FG-GAP repeat 


4.3e-75 


262.9 


602 


G_Adapt_CT 


Gamma-adaptin, C-terminus


1.1e-53


191.8 


603 


pkinase 


Eukaryotic protein kinase 
domain 


2.3e-86 


300.4 


605 


Collagen 


Collagen triple helix repeat 
(20 copies) 


8e-42 


152.4 


606 


mito_carr 


Mitochondrial carrier proteins 


6.3e-67


232.3


608 


PWWP 


PWWP domain 


2.6e-28 


107.5 


609 


PWWP 


PWWP domain 


2.6e-28 


107.5 


613 


CAP_GLY


CAP-Gly domain 


0.0046 


20.1 


615 


RFX_DNA_binding


RFX DNA-binding domain


5.2e-54 


192.9 


616 


kinesin


Kinesin motor domain


1.1e-81


284.8 


617 


kinesin 


Kinesin motor domain 


8.4e-80 


278.5 


618 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


0.0098 


13.1


620 


MATH 


MATH domain 


7.8e-05 


22.2


621 


Y_phosphatase


Protein-tyrosine phosphatase 


1.4e-32 


121.6 


622 


pkinase 


Eukaryotic protein kinase 
domain 


4.4e-40


146.6 


623 


BNR 


BNR repeat 


2.1e-11


51.3 


624 


molybdopterin


Prokaryotic molybdopterin 
oxidoreductas 


1.4e-12 


42.2 


62*' 


TPR 


TPR Domain 


1.1e-17


72.2 


627 


cNMP_binding 


Cyclic nucleotide-binding 
domain 


3.7e-58 


206.6 


630 


adh_short 


short chain dehydrogenase 


5e-17 


70.0 


631 


zf-C2H2 


Zinc finger, C2H2 type 


2.1e-88 


307.1 


632


rrm 


RNA recognition motif. 


4e-05 


30.5 


635 


pkinase 


Eukaryotic protein kinase 
domain 


1.6e-104 


360.7 


636 


Fork_head 


Fork head domain 


5.9e-27 


103.0 


637 


pkinase 


Eukaryotic protein kinase 
domain 


3.8e-70 


246.5 


642 


TPR 


TPR Domain 


4.8e-08 


40.1 


643 


efhand 


EF hand 


1.9e-27 


104.6 


647 


SNF2_N 


SNF2 and others N-terminal 
domain 


1.2e-101 


351.1 


648 


PseudoU_synth_2


RNA pseudouridylate synthase 


1.9e-55 


197.6 


650 


zf-C2H2 


Zinc finger, C2H2 type 


0.0087 


22.7 


651 


ank 


Ank repeat 


1.3e-17 


71.9 


652 


I_LWEQ 


I/LWEQ domain 


9.5e-101 


341.0 


653 


neur_chan 


Neurotransmitter-gated ion- 
channel 


4.1e-171 


581.8 


654 


tsp_1


Thrombospondin type 1 domain


4.1e-47


169.9 


659 


FH2 


Formin Homology 2 Domain 


1e-107


371.2 


661 


pou 


Pou domain - N-terminal to 
homeobox domain 


5.3e-45 


162.9 


662 


C2 


C2 domain 


6.7e-19 


76.2


663 


C2 


C2 domain 


6.7e-19 


74.2


664 


C2 


C2 domain 


6.7e-19 


76.2


Ml 


GST 


Glutathione S-transferases.


9.3e-34


114.4


668 


LRR 


Leucine Rich Repeat 


9.3e-31 


115.6 


670 


spectrin 


Spectrin repeat 


4e-57 


203.2


671 


I_LWEQ


I/LWEQ domain 


9.5e-101 


341.0 


672 


ABC_tran


ABC transporter 


5.3e-60 


212.8 


674 


WD40 


WD domain, G-beta repeat 


4.8e-24


93.3 



WO 01/53312 PCT/US00/34263 



675


WD40 


WD domain, G-beta repeat 


4.8e-24


93.3 


676 


LRR


Leucine Rich Repeat


0.0015


25.2 


679 


zf-CCCH


Zinc finger C-x8-C-x5-C-x3-H
type


2.6e-29


107.7


680 


zf-C2H2 


Zinc finger, C2H2 type 


5.2e-05 


30.1 


681 


CH 


Calponin homology (CH) domain 


2.4e-17 


71.1 


682 


DSPc 


Dual specificity phosphatase, 
catalytic doma 


4.3e-43 


156.6 


683 


zf-C3HC4 


Zinc finger, C3HC4 type (RING
finger)


0.051


10.8


687 


Synapsin 


Synapsin 


0 


1890.8


689 


PR55 


Protein phosphatase 2A 
regulatory subunit PR 


0 


1038.8 


691 


homeobox 


Homeobox domain 


8.5e-30 


112.4 


696 


Peptidase_M24


metallopeptidase family M24 


2.6e-59 


210.5 


697 


RhoGEF


RhoGEF domain


9.5e-35 


128.9 


698 


PHD 


PHD-finger


0.008 


9.3 


701 


zf-C2H2 


Zinc finger, C2H2 type 


5.5e-123 


422.0 


702 


Sulfatase 


Sulfatase 


3e-231 


781.6 


703 


zf-C2H2 


Zinc finger, C2H2 type 


5.7e-20 


79.8 


707 


Acyl_transf 


Acyl transferase domain 


1.1e-22


88.8 


708 


WD40


WD domain, G-beta repeat


4.8e-19


76.7


710 


Ran_BP1


RanBP1 domain.


8.4e-06


-7.3 


713 


DEAD 


DEAD/DEAH box helicase 


9.9e-42 


134.9 


714 


PH 


PH domain 


1.6e-09 


39.0 


715 


DSPc 


Dual specificity phosphatase, 
catalytic doma 


1.5e-37 


138.2 


717 


Sialyltransf 


Sialyltransferase family 


7.5e-31


115.9 


718


ig


Immunoglobulin domain 


1e-29


100.8 


719 


integrin_B 


Integrins, beta chain 


0 


1125.4 


720 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


1.1e-08


32.4 


722 


Peptidase_C2


Calpain family cysteine 
protease 


3e-145 


495.9 


723 


ig 


Immunoglobulin domain 


2.2e-05


22.4 


724 


F-box 


F-box domain. 


0.007 


23.0 


725 


Nop 


Putative snoRNA binding domain 


8.1e-58


205.5 


726 


Nop 


Putative snoRNA binding domain 


8.1e-58 


205.5


727 


WD40


WD domain, G-beta repeat 


7.5e-26 


99.3 


730 


dsrm 


Double-stranded RNA binding
motif 


0.027 


12.1 


731 


dynamin 


Dynamin family 


4.2e-16 


66.9 


733 


zf-CCCH 


Zinc finger C-x8-C-x5-C-x3-H 
type 


2.8e-10 


41.7 


735 


CDP-OH_P_transf


CDP-alcohol
phosphatidyltransferase


4.2e-26 


100.1 


738 


DEAD 


DEAD/DEAH box helicase 


8.6e-57 


182.5 


739 


TSC22 


TSC-22/dip/bun family 


6.5e-32 


119.5 


742 


ras 


Ras family 


2.2e-100


346.9 


743 


PMI_typeI 


Phosphomannose isomerase type I


1.2e-243 


822.9 


747 


trypsin 


Trypsin 


6.4e-88


279.4 


748 


kazal 


Kazal-type serine protease 
inhibitor domain 


2.2e-52 


187.4 


749 


efhand


EF hand 


6.3e-06 


33.1


751 


PHD 


PHD-finger


4.9e-16


66.7 


752 


zf-C2H2 


Zinc finger, C2H2 type


3.2e-21 


83.9 


753


Hydrolase


haloacid dehalogenase-like
hydrolase 


6.1e-ll 


49.8 


754 


Ribosomal_L39


Ribosomal L39 protein


0.00018 


26.7 


755 


PH


PH domain 


3.6e-14 


55.7 


758 


SCAN 


SCAN domain 


1.4e-53 


191.5 


759 


PA 


PA domain 


0.0065 


23.1


760 


arf


ADP-ribosylation factor family


2.2e-19


77.8 


761 


CIDE-N 


CIDE-N domain 


2.2e-40 


147.6


762


histone 


Core histone H2A/H2B/H3/H4 


9.9e-53 


188.6 


763 


zf-MYND 


MYND finger 


4.1e-14


60.3 


764 


pou 


Pou domain - N-terminal to
homeobox domain


1e-52


188.6 


767 


vwc 


von Willebrand factor type C 
domain 


2.9e-34 


127.3 


769 


efhand 


EF hand 


4.8e-11


50.1 


770 


zf-C4 


Zinc finger, C4 type (two 
domains) 


2.4e-53 


181.6 


772 


ras


Ras family 


7e-90 


312.0 


773 


Sulfatase


Sulfatase


1e-142


487.5 


775 


zf-C2H2 


Zinc finger, C2H2 type 


1.1e-12


55.5 


776 


zf-C2H2 


Zinc finger, C2H2 type 


1.1e-12


55.5 


777 


zf-C2H2 


Zinc finger, C2H2 type 


1.1e-12


55.5 


778 


rrm 


RNA recognition motif. 


2.1e-32 


121.1


779 


G6PD 


Glucose-6-phosphate
dehydrogenase


1.5e-76 


236.6 


780 


spectrin 


Spectrin repeat 


3.7e-29


110.3 


781 


mito_carr


Mitochondrial carrier proteins


4.6e-57


198.5 


782 


SCAN 


SCAN domain 


1.3e-24 


95.2 


783 


PDZ 


PDZ domain (Also known as DHR 
or GLGF).


4.1e-07


37.1 


785 


DEAD 


DEAD/DEAH box helicase 


6e-06 


21.7


786 


ras 


Ras family 


5.3e-39 


143.0 


787 


RNase_HII 


Ribonuclease HII 


2.5e-67 


237.1 


790 


PI3_PI4_kinase


Phosphatidylinositol 3- and 4-
kinases


5.4e-108 


372.2 


795 


cadherin 


Cadherin domain 


2.5e-40


147.4


796 


ARID 


ARID DNA binding domain 


1.6e-20 


81.6 


797 


trypsin 


Trypsin 


9.9e-20 


64.8 


799 


CH 


Calponin homology (CH) domain 


3.7e-15 


63.8 


801 


Gal-bind_lectin


Vertebrate galactoside-binding 
lectin 


4.1e-25 


88.7 


803 


WD40 


WD domain, G-beta repeat 


0.00082 


26.1 


806 


TBC 


TBC domain 


1.8e-26 


101.4 


807 


TBC 


TBC domain 


1.8e-26 


101.4 


808 


CN_hydrolase 


Carbon-nitrogen hydrolase 


8.8e-80 


278.5 


811 


CBFD_NFYB_HMF


Histone-like transcription 
factor 


6e-14 


59.8 


812 


adh_short 


short chain dehydrogenase 


8.1e-20 


79.3 


814 


IMP4 


Domain of unknown function 


3.3e-71


250.0 


815 


zf-C2H2 


Zinc finger, C2H2 type 


8.2e-66


232.1 


816 


Pept_tRNA_hydro


Peptidyl-tRNA hydrolase 


1.6e-37 


138.0 


817 


ARID 


ARID DNA binding domain 


2.5e-18 


74.3 


826 


IF5_eIF4_eIF2


eIF4-gamma/eIF5/eIF2-epsilon


1.6e-32


121.5


830 


ArfGap 


Putative GTP-ase activating 
protein for Arf 


1.5e-53 


191.3


831 


LRR 


Leucine Rich Repeat 


2.1e-26


101.1


832 


laminin_EGF 


Laminin EGF-like (Domains III 
and V) 


2e-57 


204.2


839 


rrm 


RNA recognition motif. 


1.3e-22 


B6.fi 


840 


Y_phosphatase


Protein-tyrosine phosphatase


2.6e-119


409.8


841 


pkinase 


Eukaryotic protein kinase 
domain 


3.4e-100


346.3


844 


Ribosomal_L22e


Ribosomal L22e protein family


1e-64


228.4


846 


IBR


IBR domain


9e-15 


62.5 


849 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


7.4e-07 


26.5 


850 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


0.00016 


18.9 


851 


SET 


SET domain 


5e-30 


113.2


852 


SRCR 


Scavenger receptor cysteine-
rich domain


0


1025.4


853 


SRCR 


Scavenger receptor cysteine-
rich domain


0 


1025.4 


857


lactamase_B


Metallo-beta-lactamase
superfamily 


0.012 


-6.0 


858 


COX6A 


Cytochrome c oxidase subunit 
VIa


3.4e-58


206.7 


859


rrm 


RNA recognition motif. 


5.4e-45 


162.9


861 


PRK 


Phosphoribulokinase


5.1e-62 


219.4 


863 


mito_carr 


Mitochondrial carrier proteins 


2.9e-53 


185.5 


864 


HSP90 


Hsp90 protein 


4.7e-158 


538.5 


866 


ig


Immunoglobulin domain 


4e-12 


44.1 


867 


zf-C2H2 


Zinc finger, C2H2 type 


7e-135 


461.5 


872 


histone 


Core histone H2A/H2B/H3/H4 


4.9e-41


149.8 


874 


CPSase_L_chain


Carbamoyl-phosphate synthase
(CPSase) 


2.1e-218 


739.0 


879 


Ribosomal_S12e


Ribosomal protein S12e


2.1e-98


340.3


882 


serpin 


Serpins (serine protease 
inhibitors) 


2.5e-42 


145.7 


883 


Patatin 


Patatin 


1.2e-51 


182.0


884 


RA 


Ras association (RalGDS/AF-6)
domain 


0.044 


8.0 


887 


DUF92 


Integral membrane protein DUF92 


2.7e-12 


54.3 


889 


sugar_tr 


Sugar (and other) transporter 


8.2e-63 


222.1


893 


DUF28 


Domain of unknown function 
DUF28 


1.3e-43 


158.3


896 


IP_trans 


Phosphatidylinositol transfer
protein


6.5e-98


338.7


898 


DEAD 


DEAD/DEAH box helicase 


1.5e-48 


156.5 


899 


KE2 


KE2 family protein 


7e-61 


215.7 


900 


KE2 


KE2 family protein 


4.3e-51


183.2 


901 


zf-C2H2


Zinc finger, C2H2 type 


2.7e-57 


203.8 


902 


ras 


Ras family 


2.3e-75


263.8


904 


TPR 


TPR Domain 


3.2e-22


87.2 


906 


GBP 


Guanylate-binding protein 


8.9e-253 


853.1


907 


GBP 


Guanylate-binding protein 


1.1e-239


809.6 


908 


WD40 


WD domain, G-beta repeat 


2.6e-26 


100.8 


909 


PH 


PH domain 


1.3e-09 


39.4 


910 


zf-C2H2


Zinc finger, C2H2 type 


2.5e-39 


144.1 


913 


Epimerase 


NAD dependent
epimerase/dehydratase family


5e-07 


-88.5 


921 


TBC 


TBC domain 


1.5e-09


30.7


922 


WD40 


WD domain, G-beta repeat 


1.6e-25 


98.2


923 


WD40


WD domain, G-beta repeat


8.2e-07


36.1 


924 


Hydrolase 


haloacid dehalogenase-like
hydrolase 


2.9e-05 


29.1 


925 


UQ_con


Ubiquitin-conjugating enzyme


0.00033


-27.6 


926 


CH 


Calponin homology (CH) domain 


3.3e-53


190.2 


928 


WD40


WD domain, G-beta repeat 


5.9e-48 


172.7 


929 


zf-C3HC4


Zinc finger, C3HC4 type (RING
finger)


3.1e-10


37.4 


930 


Ribul_P_3_epim


Ribulose-phosphate 3 epimerase 
family 


7.2e-105 


361.8 


931 


Ribul_P_3_epim


Ribulose-phosphate 3 epimerase
family


1.2e-96


334.4 


936 


C2 


C2 domain 


2.2e-62 


220.7 


937 


NAP_family 


Nucleosome assembly protein 
(NAP) 


l.le-22 


84. £ 


940 


abhydrolase 


alpha/beta hydrolase fold 


0.011 


3.1 


944 


Tropomyosin 


Tropomyosins 


3.2e-07


25.1 


948


pkinase 


Eukaryotic protein kinase 
domain 


3.4e-75


263.2


949 


WD40


WD domain, G-beta repeat


1.8e-27


104.7


950 


Acyltransferase


Acyltransferase


1.6e-07


38.4


951 


SAM 


SAM domain (Sterile alpha 
motif) 


0.014 


14.5


954 


GFO_IDH_MocA


Oxidoreductase family 


1.3e-11


52.0 


955 


BTB 


BTB/POZ domain 


7e-22 


86.1 


956 


BTB 


BTB/POZ domain 


7e-22 


86.1


957 


CDP-OH_P_transf


CDP-alcohol
phosphatidyltransferase


0.053 


-22.2 


959 


ras 


Ras family 


2.4e-97 


336.8 


960 


ras 


Ras family 


8.4e-43 


155.6 


961 


Acetyltransf


Acetyltransferase (GNAT) family


1.2e-08 


42.2 


962 


adh_short 


short chain dehydrogenase 


2.4e-31 


117.6 


963 


mutT 


Bacterial mutT protein 


5.6e-06 


26.2 


969


IF-2B 


Initiation factor 2 subunit 
family 


8.4e-193 


653.9 


970 


RNase_PH


3' exoribonuclease family


9e-24 


92.4 


975 


WW 


WW domain 


5.7e-25


96.4 


977 


PDZ 


PDZ domain (Also known as DHR
or GLGF).


3.6e-21


83.7 


978 


Ribosomal_L17


Ribosomal protein L17 


2.4e-20 


81.0 


979 


LIM 


LIM domain containing proteins 


5.8e-42 


152.0 


980 


Calsequestrin


Calsequestrin


1.7e-297


1001.7


982 


HSP20 


Hsp20/alpha crystallin family 


1.2e-10 


43.2 


983 


oxidored_q6


NADH ubiquinone oxidoreductase, 
20 Kd sub 


4.8e-63 


222.9


988 


TBC 


TBC domain 


2.2e-50 


180.8 


989 


TBC 


TBC domain 


2.2e-50 


180.8


993 


tRNA_int_endo


tRNA intron endonuclease 


0.0017 


-34.2 


994 


homeobox 


Homeobox domain 


4e-18 


73.6 


997 


pyr_redox


Pyridine nucleotide-disulphide 
oxidoreducta 


0.012


11.6 


1000 


mito_carr 


Mitochondrial carrier proteins 


9.7e-123 


421.2 


1001 


RA 


Ras association (RalGDS/AF-6)
domain 


1.2e-15 


45.4 


1004


DUF81


Domain of unknown function 
DUF81 


0.099 


10.2 


1005 


actin 


Actin 


1.3e-174


574.3 


1006 


actin 


Actin 


3.1e-130


428.6 


1007 


cpn60_TCP1


TCP-1/cpn60 chaperonin family


3.7e-195


661.8 


1008 


TPR 


TPR Domain 


8.1e-44


159.0 


1009 


zf-C2H2


Zinc finger, C2H2 type


3.6e-61


216.6 


1011 


zf-C2H2 


Zinc finger, C2H2 type 


3.6e-61 


216.6 


1012 


zf-C3HC4 


Zinc finger, C3HC4 type (RING
finger) 


4.7e-15 


53.1 


1016 


tRNA-synt_2c 


tRNA synthetases class II (A) 


2.3e-15 


55.2 


1018 


RhoGAP 


RhoGAP domain 


1.6e-78 


274.3


1022 


PGAM 


Phosphoglycerate mutase family 


3.8e-18


69.7 


1025 


HMG_box


HMG (high mobility group) box


8.4e-20


79.2 


1027 


TBC 


TBC domain 


7.3e-45 


162.5 


1028 


UQ_con 


Ubiquitin-conjugating enzyme 


1.4e-49 


178.1 


1032 


PDZ 


PDZ domain (Also known as DHR 
or GLGF) . 


0.028


l£.3 


1034 


Hydrolase 


haloacid dehalogenase-like 
hydrolase 


2e-21 


84.6


1037 


KRAB 


KRAB box 


4.8e-06


32.4 


1038 


Cation_efflux


Cation efflux family 


7.1e-42 


152.5 


1040 


ART 


NAD:arginine ADP- 
ribosyltransferase


4.7e-47


169.1 


1042 


WD40 


WD domain, G-beta repeat 


1.9e-18


74.7


1043 


zf-C2H2 


Zinc finger, C2H2 type 


3.7e-24


93.7


1045 


lectin_c 


Lectin C-type domain 


1.9e-28 


108.0


1046 


Glucosamine_iso


Glucosamine-6-phosphate
isomerase 


0.00013 


-25.1


ligase-CoA 


CoA-ligases 


4.5e-80


279.4 


1049


ig


Immunoglobulin domain 


1.7e-09


35.6 


1050 


Ribosomal_L24e


Ribosomal protein L24e


2e-33 


124.5 


1054 


Amidase 


Amidase 


4.3e-152 


518.7 


1055 


rrm 


RNA recognition motif. 


3.8e-26


100.3 


lUJO 


annexin 


Annexin 


6.9e-44 


159.2 




PMP22_Claudin


PMP-22/EMP/MP20/Claudin family 


0.023 


-23.6 


1060 


homeobox


Homeobox domain


3.2e-31


117.2


1062 


Acyltransferase


Acyltransferase


0.00065


10.5 


1064


AMP-binding


AMP-binding enzyme


6.6e-100


345.3 


1065


LRR


Leucine Rich Repeat


3.3e-14


60.6


1066 


GTP1_OBG


GTP1/OBG family


4.8e-41


141.8


1071 


ig 


Immunoglobulin domain 


8.4e-48 


159.1 


1072 


PHD


PHD-finger


6.8e-07


36.3 


X U / ft 


DENN 


DENN (AEX-3) domain 


8.3e-33 


121.5 


1075 


SCP 


SCP-like extracellular protein 


4.7e-41 


149.8 


1077


OLF


Olfactomedin-like domain


2.2e-66


234.0 


1078 


mito_carr


Mitochondrial carrier proteins


1e-42


149.3 


1079


WD40 


WD domain, G-beta repeat 


6.2e-45 


162.7 


1087 


START 


START domain 


1.5e-48 


174.7 


1093 


DSPc


Dual specificity phosphatase,
catalytic doma 


3.3e-63 


223.4 


1094 


GSHPx 


Glutathione peroxidases 


9.6e-41


148.8


1095 


DUF25 


Domain of unknown function 
DUF25 


2e-75 


264.0 


1096 


DUF25


Domain of unknown function 
DUF25 


6e-75


262.4


1105 


Nitroreductase


Nitroreductase family 


1.3e-13 


58.6 


1106 


PTE 


Phosphodiesterase family 


1.3e-179


610.1 


1107 


DAGKc 


Diacylglycerol kinase catalytic 
domain 


0.00049


19.6 


1109 


ras 


Ras family 


1.3e-15 


40.7 


1115 


ArfGap 


Putative GTP-ase activating 
protein for Arf 


9.7e-47 


168.7 


1116 


HMG14_17


HMG14 and HMG17


4.4e-21


83.5


1117 


HMG14_17


HMG14 and HMG17 


9.9e-12 


52.4 


1119 


FAA_hydrolase


Fumarylacetoacetate (FAA) 
hydrolase fam 


2e-83 


290.6 


inn i 


pkinase 


Eukaryotic protein kinase 
domain 


1.4e-94 


327.6 




abhydrolase 


alpha/beta hydrolase fold


9.2e-23 


89.0 


1129 


pro_isomerase


Cyclophilin type peptidyl- 
prolyl cis-tr 


2.2e-56 


197.1 


Till 


DnaJ 


DnaJ domain 


1.6e-30 


114.9 


1132 


WD40 


WD domain, G-beta repeat 


1.3e-19 


78.6 


1133 


WD40 


WD domain, G-beta repeat 


1.8e-15 


64.9 


1134 


PH 


PH domain 


0.0015 


17.8 


1136 


Adap_comp_sub


Adaptor complexes medium 
subunit family 


1.2e-256 


866.0 


1137 


Adap_comp_sub


Adaptor complexes medium 
subunit family 


2.5e-209 


708.8 


1139 


ras 


Ras family 


1.5e-86 


301.0 


1141 


pkinase


Eukaryotic protein kinase 
domain 


9.4e-74 


258.4 


1152 


Acyltransferase


Acyltransferase


1.2e-05 


29.9 


1153 


IRS 


PTB domain (IRS-1 type)


5.4e-55 


196.1 


1155 


ig 


Immunoglobulin domain 


1.3e-31 


106.9 


1157 


Asparaginase_2


Asparaginase 


6.4e-72 


252.3


1159 


GMC_oxred


GMC oxidoreductases


4.7e-142


485.3 


1160 


zf-AN1


AN1-like Zinc finger


0.00021 


27.9


1163 


linker_histone


linker histone H1 and H5 family


3.8e-14


60.4 


1164 


DED 


Death effector domain 


3.9e-05


30.5 


1165 


IRS 


PTB domain (IRS-1 type)


2.6e-43 


157.3 


1166 


IRS 


PTB domain (IRS-1 type)


2.6e-43 


157.3 


1168 


SAM 


SAM domain (Sterile alpha
motif) 


0.04 


10.5 


1170 


abhydrolase 


alpha/beta hydrolase fold 


0.098


-7.5 


1174 


SAP 


SAP domain 


3.9e-10 


47.1


1177 


PP2C 


Protein phosphatase 2C 


5.3e-31 


112.5 


1178 


WD40 


WD domain, G-beta repeat 


4.7e-35 


129.9 


1180 


Ets 


Ets-domain 


1.8e-09 


33.3 


1181 


Collagen 


Collagen triple helix repeat 
(20 copies) 


0.00016 


24.7 


1182 


TCL1_MTCP1 


TCL1/MTCP1 family 


9.5e-56 


198.6 


1184 


RasGEF 


RasGEF domain 


1.7e-88 


307.4 


1185 


mito_carr


Mitochondrial carrier proteins 


1.5e-62 


217.3 


1187 


UPAR_LY6 


u-PAR/Ly-6 domain 


0.0042 


15.6 


1188 


Orn_DAP_Arg_deC


Pyridoxal-dependent
decarboxylase


6.2e-128


430.6 


1193 


Stathmin 


Stathmin family 


1.8e-90 


314.0


1194 


Stathmin 


Stathmin family 


1.8e-90 


314.0


1195 


Sec1


Sec1 family


3.2e-183


622.1


1196 


pyr_redox 


Pyridine nucleotide-disulphide
oxidoreducta 


3.1e-32 


111.8 


1197 


Glyco_transf_8


Glycosyl transferase family 8 


1.2e-09 


45.5 


1202 


K_tetra 


K+ channel tetramerisation 
domain 


0.022 


-16.8 


1203 


adh_short 


short chain dehydrogenase 


8.3e-45 


162.3


1206 


Ubie_methyltran


ubiE/COQ5 methyltransferase
family 


1.3e-121 


417.4 


1208 


7tm_3


7 transmembrane receptor 


7.2e-09 


29.0


1209 


ank 


Ank repeat 


3.9e-15


63.7 


1210 


vATP-synt_AC39


ATP synthase (C/AC39) subunit


2.5e-128


439.7


1212 


zf-C2H2


Zinc finger, C2H2 type 


5.5e-17 


69.9 


1213 


efhand


EF hand


3.2e-07


37.4 


1219 


rrm 


RNA recognition motif. 


2.1e-40 


147.7 


1220 


DUF6 


Integral membrane protein DUF6 


0.015 


21.5 


1222 


SCAN 


SCAN domain 


1.5e-71


251.1 


1223 


G-gamma


GGL domain


3.6e-36


129.5 


1227 


catalase 


Catalase 


0


1158.9 


1232 


PX 


PX domain 


2.2e-15 


64.5 


1233 


PX 


PX domain 


2.2e-15 


64.5 


1236 


FCH 


Fes/CIP4 homology domain 


3.3e-09 


44.0


1241 


Peptidase_M20


Peptidase family M20/M25/M40 


2e-63 


224.1 


1243 


WW 


WW domain 


0.044 


17.9 


1247 


UPF0006 


Metalloenzyme of unknown 
function UPF0006 


6.3e-61


215.8 


1248 


Glycos_transf_2


Glycosyl transferases 


4.5e-10 


46.9 


1249 


efhand 


EF hand 


4e-11


50.4 


1254 


UQ_con 


Ubiquitin-conjugating enzyme 


2.1e-73 


257.3 


1255 


ras 


Ras family 


2.2e-62 


220.7 


1256 


formyl_transf


Formyl transferase


4.9e-30


108.3 


1259 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


5.3e-13 


46.4 


1261 


DiHfolate_red


Dihydrofolate reductase 


2.1e-69 


241.7 


1262 


G_glu_transpept


Gamma-glutamyltranspeptidase 


1.8e-110 


380.4 


1263 


PAS 


PAS domain 


1.3e-08 


36.9


1265 


LRR 


Leucine Rich Repeat 


4.2e-22


86.9

1266 


SCP 


SCP-like extracellular protein 


6e-29 


108.0 


1267 


K_tetra 


K+ channel tetramerisation 
domain 


2.8e-27 


104.0 


1269 


ras 


Ras family 


1.3e-85 


297.9 


1275 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


4.2e-10


37.0 


1276 


abhydrolase 


alpha/beta hydrolase fold 


5.4e-23 


89.8 


1277 


abhydrolase 


alpha/beta hydrolase fold 


5.6e-21


83.1 


1279 


trypsin 


Trypsin 


4.4e-41 


132.0 


1280 


PBP 


Phosphatidylethanolamine- 
binding protein 


1.3e-13 


58.7 


1285 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


5.6e-14 


49.6 


1287 


ank 


Ank repeat 


1.7e-52 


187.8


1294 


fn3


Fibronectin type III domain 


0.026 


20.9 


1295 


GBP 


Guanylate-binding protein 


0.00026 


-70.0 


1296 


PMP22_Claudin


PMP-22/EMP/MP20/Claudin family


6.9e-41 


149.3 


1297 


Rhodanese 


Rhodanese-like domain


3.2e-14


60.7


1298 


LIM 


LIM domain containing proteins 


5.8e-21


79.1 


1301 


rnaseA 


Pancreatic ribonucleases 


4.9e-43 


145.2 


1307 


mito_carr


Mitochondrial carrier proteins 


2.1e-53 


186.0 


1308 


WD40


WD domain, G-beta repeat 


1.6e-17 


71.6 


1310 


UPAR_LY6


u-PAR/Ly-6 domain 


7.1e-20 


75.5 


1313 


thiored 


Thioredoxin 


3.6e-05


21.6 


1314 


Aa_trans


Transmembrane amino acid 
transporter protein 


1.5e-67 


237.9 


1316 


trypsin 


Trypsin 


4.4e-41 


132.0 


1320 


Ribosomal_L13


Ribosomal protein L13 


3.9e-62 


219.8 


1327 


Armadillo_seg


Armadillo/beta-catenin-like
repeats 


0.0054 


23.4 


1328 


KRAB 


KRAB box


0.052 


-5.6 


1329 


rrm 


RNA recognition motif. 


2.1e-40 


147.7 


1330 


Bcl-2 


Apoptosis regulator proteins,
Bcl-2 family 


0.014 


-1.6 


1331 


PX 


PX domain 


2.1e-10


48.0 


1333 


KRAB 


KRAB box 


1.8e-36 


134. £ 


1334 


UPP_syntheta 
se 


Putative undecaprenyl 
diphosphate synt 


2.3e-89


310.3 


1335 


UPP_syntheta 
se 


Putative undecaprenyl
diphosphate synt 


1.8e-59 


211.0 


1336 


DSPc 


Dual specificity phosphatase, 
catalytic doma 


1.2e-31 


118.6 


1337 


DSPc 


Dual specificity phosphatase, 
catalytic doma 


2.3e-12


54.5 


1338 


TPR 


TPR Domain


0.00021


28.1 


1340 


metalthio


Metallothionein


0.013


20.3 


1341 


mutT 


Bacterial mutT protein 


5.8e-09


36.5 


1343 


Band_41


FERM domain (Band 4.1 family) 


1.3e-38 


122.5 


1344 


Kelch 


Kelch motif 


1.4e-44 


161.5 


1345 


Antifreeze 


Antifreeze protein 


1.2e-10 


48.8 


1347 


3beta_HSD


3-beta hydroxysteroid
dehydrogenase/isomera


0.086 


-177.2 


1348 


BTB 


BTB/POZ domain 


5.3e-28 


106.5 


1349 


DUF6 


Integral membrane protein DUF6 


0.033 


15.8 


1350 


myosin_head 


Myosin head (motor domain) 


0 


1088.7 


1352 


Nramp 


Natural resistance-associated 
macrophage pro 


1.2e-202 




1353 


S_100 


S-100/ICaBP type calcium 
binding domain 


5.3e-23 


89.9 


1355 


DEAD 


DEAD/DEAH box helicase 


3.^e-65 


209.0 


1356 


C2 


C2 domain 


2.4e-15 


64.4 


1357 


RBD 


Raf-like Ras-binding domain


4.2e-57


203.1


1360 


zf-C2H2


Zinc finger, C2H2 type 


7.4e-141 


481.4 


1361 


HMG14_17


HMG14 and HMG17 


7.9e-40 


145.7

1362


SIS 


SIS domain 


3.8e-30


113.6


1363 


SIS 


SIS domain 


1.3e-28 


108.5 


1364 




Immunoglobulin domain 


0.00026 


19.0 


1368 


K_tetra 


K+ channel tetramerisation 
domain 


1.1e-16


eis.s 


1371 


Collagen 


Collagen triple helix repeat 
(20 copies) 


2.2e-113


390.1 


1372 


DnaJ 


DnaJ domain 


6.6e-36 


132.7


1376 


KRAB 


KRAB box 


2.1e-38 


141.0 


1378 


ELM2 


ELM2 domain 


2e-23 


91.3 


1380 


thiored 


Thioredoxin 


1.2e-23 


82.8 


1381 


ank 


Ank repeat 


2.3e-83 


290.4 


1382 


BTB 


BTB/POZ domain 


3e-11


50.8 


1383 


WD40 


WD domain, G-beta repeat 


1.6e-19 


78.3 


1384 


WD40 


WD domain, G-beta repeat


6.3e-24 


92.9 


1387 


zf-C3HC4 


Zinc finger, C3HC4 type (RING 
finger) 


1.1e-09


35.6 


1389 


zf-C2H2


Zinc finger, C2H2 type 


5.5e-50 


179.5 


1390 


zf-C2H2


Zinc finger, C2H2 type 


2.5e-85


296.9


1393 


kinesin 


Kinesin motor domain 


7.8e-188 


637.4 


1394 


zf-C2H2 


Zinc finger, C2H2 type 


1.2e-49 


178.4 


1398 


KRAB 


KRAB box 


5.1e-22 


8^ 


1402 


bZIP 


bZIP transcription factor 


0.035 


13.1 


1405 


sugar_tr 


Sugar (and other) transporter 


0.003 


-101.5 


1406 


RhoGAP 


RhoGAP domain 


8.9e-47


168.8 


1407 


rrm 


RNA recognition motif. 


1e-35


132.1 


1408


LRR 


Leucine Rich Repeat 


2.1e-13 


58.0 


1409 


Nebulin_repeat


Nebulin repeat 


6e-54 


192.6 


1410 


ank 


Ank repeat 


1.6e-17 


71.6


1412 


Ribosomal_L5_C


ribosomal L5P family C-terminus


8.2e-58 


205.5 


1415 


trypsin 


Trypsin 


4.7e-85 


270.4 


1416 


aminotran_1


Aminotransferases class-I


4.4e-05


-91.2 


1417 


S1


S1 RNA binding domain


1.6e-07


33.1 


1419 


WD40


WD domain, G-beta repeat 


2.2e-09 


44. £ 


1422 


cadherin 


Cadherin domain 


8.3e-42 


152.3 


1424 


SH3 


SH3 domain 


2.5e-80 


280.3 


1425 


PHD 


PHD-finger 


3.2e-17


70.6 


1426 


PHD 


PHD-finger


3.2e-17


70.6 


1427 


ArfGap


Putative GTP-ase activating 
protein for Arf 


1e-37


138.8 


1428 


helicase_C 


Helicases conserved C-terminal
domain 


1e-26


102.2 


1429 


WD40


WD domain, G-beta repeat 


3.9e-07


37.2 


1430 


inositol_P 


Inositol monophosphatase family 


2.5e-10 


40.2 


1431 


mito_carr


Mitochondrial carrier proteins 


4.3e-83


287.7 


1433


C1q


C1q domain


2.9e-16


66.2 


1434 


WD40


WD domain, G-beta repeat 


1.6e-13 


58.3 


1435 


Inos-1-P_synth


Myo-inositol-1-phosphate
synthase 


7e-228 


770.4 


1436 


rrm


RNA recognition motif. 


1.4e-34 


128.3 


1438 


ig 


Immunoglobulin domain 


1.3e-12 


45.6 


1440 


G_Adapt_CT 


Gamma-adaptin, C-terminus


3.4e-67


236.7 


1441 


G_Adapt_CT 


Gamma-adaptin, C-terminus


3.4e-67


234.7 


1443 


Kelch 


Kelch motif 


0.00013


28.7 


1446 


ARID 


ARID DNA binding domain 


1.8e-21


84.7


1447 


zf-C2H2 


Zinc finger, C2H2 type 


9.4e-28


105.6 


1448 


AMP-binding


AMP-binding enzyme 


2.6e-07 


-145.1 


1451 


rrm 


RNA recognition motif. 


6.5e-21


82.9


1454 


ig 


Immunoglobulin domain 


5.6e-44 


146.7 


1455 


Sialyltransf 


Sialyltransferase family 


5.4e-21


83.2


1460 


Aldose_epim 


Aldose 1-epimerase 


1.9e-35 


131.2 


1461 


C2 


C2 domain 


4e-18 


73.6


1470 


TIG 


IPT/TIG domain 


3.1e-19 


77.3 


1472 


PseudoU_synth_2


RNA pseudouridylate synthase


4.3e-16


66.9








1474 


DENN 


DENN (AEX-3) domain 


1.3e-44 


161.6 


1475 


Cation_efflux


Cation efflux family 


4.6e-49 


176.4 


1477 


TBC 


TBC domain 


8e-47 


169.0 


1478 


rrm 


RNA recognition motif. 


2e-21 


84.6 


1480 


ig 


Immunoglobulin domain 


5.5e-06 


24.3


1484 


Telo_bind_alpha


Telomere-binding protein alpha 
subuni 


0.028 


-225.9 


1485 


zf-C2H2 


Zinc finger, C2H2 type 


1.8e-48


240.9 


1486 


pkinase 


Eukaryotic protein kinase 
domain 


9.5e-13 


49.9 


1488 


helicase_C 


Helicases conserved C-terminal
domain 


1.4e-15


65.2 


1489 


DUF89 


Protein of unknown function 
DUF89 


0.079 


-132.4 


1490 


ECH 


Enoyl-CoA hydratase/isomerase
family 


5.2e-41 


149.7 


1491 


guanylate_cyc


Adenylate and Guanylate cyclase 
catalyt 


5.9e-46 


166.1 


1492 


LRR 


Leucine Rich Repeat 


3.4e-19 


77.2 


149* 


zf-C3HC4


Zinc finger, C3HC4 type (RING 
finger) 


7.1e-10


36.3 


1497 


pkinase 


Eukaryotic protein kinase 
domain 


1e-22


85.8 


1500 


SH3 


SH3 domain 


9.3e-05 


27.2 


1502 


homeobox 


Homeobox domain 


0.084 


13.8


1503 


homeobox 


Homeobox domain 


0.084 


13.8


1505 


EGF 


EGF-like domain


2.7e-23 


90.8


1506 


UCH-2 


Ubiquitin carboxyl-terminal
hydrolase family 


2.7e-21 


84.2 


1508 


Peptidase_M20


Peptidase family M20/M25/M40 


2.8e-28 


101.8 


1511 


PX 


PX domain 


1.9e-11


51.5 


1512 


Sulfatase


Sulfatase 


2.8e-35 


130.7 


1516 


Syntaxin 


Syntaxin 


0.011 


-62.3


1518 


aminotran_3 


Aminotransferases class-III
pyridoxal-pho


9.7e-106 


305.6 


1520 


ig 


Immunoglobulin domain 


0.075


11.0 


1521 


RA 


Ras association (RalGDS/AF-6)
domain 


0.013 


13.3


1523 


RhoGAP 


RhoGAP domain 


2.5e-05 


18.7 


1528 


WD40 


WD domain, G-beta repeat 


5.4e-24 


93.1 


1535 


IMS 


impB/mucB/samB family 


7.8e-95 


328.5 


1538 


FYVE 


FYVE zinc finger 


3.2e-27 


101.5 


1539 


DAGKC 


Diacylglycerol kinase catalytic 
domain 


6e-07 


36.5 


1540 


Ocular_alb


Ocular albinism type 1 protein 


0 


1184.7 


1653 


SAP 


SAP domain 


6e-06 


33.2 


1654 


Amino_oxidase


Flavin containing amine oxidase 

3.2e-43


157.0 


1655


Amino_oxidase


Flavin containing amine oxidase 


3.2e-43 


157.0 


1656 


RhoGEF 


RhoGEF domain 


1.4e-24 


95.1 


1657 


MMR_HSR1


GTPase of unknown function 


0.0011 


-45.5 


1659


UCH-2


Ubiquitin carboxyl-terminal
hydrolase family 


2.5e-11


51.1 


1660 


actin 


Actin 


6.6e-21 


69.9 


1661 


BAH 


BAH domain 


1.7e-82 


287.5 


1662 


vwa 


von Willebrand factor type A 
domain 


0 


1909.4 


1663 


WD40 


WD domain, G-beta repeat 


1.4e-67 


237.9 


1667 


zf-C2H2


Zinc finger, C2H2 type 


1.3e-93 


324.4


1669 


Nol1_Nop2_Sun


NOL1/NOP2/sun family


1.3e-23 


84.3 


1671 


SH2 


Src homology domain 2 


5.4e-15


46.9

1672 


chromo


'chromo' (CHRromatin
Organization Modifier) 


2.1e-18 


67.7 


1674 


zf-CCCH 


Zinc finger C-x8-C-x5-C-x3-H 
type 


0.0025 


17.6


1676 


Glyco_hydro_47


Glycosyl hydrolase family 47


1.8e-187 


636.2 


1677 


Glyco_hydro_47


Glycosyl hydrolase family 47


4.5e-74


259.5 


1680 


WD40 


WD domain, G-beta repeat 


1.1e-27


105.5 


1681 


WD40 


WD domain, G-beta repeat 


1.1e-27


105.5 


1683 


MMR_HSR1 


GTPase of unknown function 


1.8e-78 


274.1 


1691 


rrm 


RNA recognition motif. 


1.8e-37 


137.9 


1692 


rrm 


RNA recognition motif. 


1.8e-37 


137.9 


1693 


AAA 


ATPases associated with various 
cellular act 


1.3e-81 


284.5


1697 


Ferric_reduct


Ferric reductase like 
transmembrane com 


8.4e-82


285.2


1698


Ferric_reduct


Ferric reductase like 
transmembrane com 


3.5e-53 


190.1 


1699 


zf-C2H2


Zinc finger, C2H2 type 


4.4e-34


126.6 


1700 


arf 


ADP-ribosylation factor family 


9e-19 


75.8 


1702 


GTP_EFTU 


Elongation factor Tu family 


0.014 


11.4 


1703 


SCAN 


SCAN domain 


1.8e-54 


194.4 


1707 


pkinase


Eukaryotic protein kinase 
domain 


1.2e-88 


307.9 


1709 


WD40


WD domain, G-beta repeat 


0.0035 


24.0


1710 


LRR 


Leucine Rich Repeat 


1.2e-30 


115.3 


1711 


WW 


WW domain 


7.6e-12 


52.8 


1712 


ank 


Ank repeat 


4.2e-34 


126.7 


1713 


zf-CCCH 


Zinc finger C-x8-C-x5-C-x3-H 
type 


2.6e-09 


38.3 


1714 


zf-CCCH 


Zinc finger C-x8-C-x5-C-x3-H
type 


2.6e-09 


38.3 


1715 


ras 


Ras family 


4.4e-41


149.9


1718 


HMG_box 


HMG (high mobility group) box 


8.3e-21


82.6


1719 


TBC 


TBC domain 


1.1e-45


165.2 


1721 


HLH 


Helix-loop-helix DNA-binding 
domain 


9.2e-10 


45.9 


1723 


dsrm


Double-stranded RNA binding 
motif 


2.9e-05 


30.9 


1724 


RrnaAD 


Ribosomal RNA adenine
dimethylases 


0.045 


9.5 


1725 


CIDE-N 


CIDE-N domain 


5.9e-40 


146.2 


1726 


HAT 


HAT (Half-A-TPR) repeats


2.9e-44 


160.5 


1728 


efhand 


EF hand 


5.1e-20 


79.9 


1733 


Hist_deacetyl


Histone deacetylase family


1.7e-104


360.6 


1735 


LRR 


Leucine Rich Repeat 


4.6e-34


126.6


1739 


PI-PLC-X 


Phosphatidylinositol-specific 
phospholipase


0.0023 


16.1 


1743 


ras 


Ras family 


3.7e-10 


-21.3 


1744 


ras 


Ras family 


3.7e-10 


-21.3 


1745 


RasGEF 


RasGEF domain 


3.2e-49 


176.9 


1746


adh_short 


short chain dehydrogenase 


7.1e-08 


34.6 


1751 


zf-C2H2


Zinc finger, C2H2 type 


9e-39 


142.2 


1754 


fn3 


Fibronectin type III domain 


5.5e-101 


348.9 


1756 


zf-C2H2 


Zinc finger, C2H2 type 


6.3e-93 


322.1 


1758 




RNA recognition motif. 


0.017 


21.2


1760 


Nop 


Putative snoRNA binding domain 


6.1e-95 


328.8 


1761 


Nop 


Putative snoRNA binding domain 


6.1e-95


328.8 


1765 


MMR_HSR1


GTPase of unknown function 


6.4e-41


149.4 


1769 


CN_hydrolase


Carbon-nitrogen hydrolase


3e-06 


-43.9


1775 


ank 


Ank repeat 


4.1e-07 


37.1 


1779 


Oxysterol_BP


Oxysterol-binding protein 


4.7e-56


199.6 


1783 


RhoGEF


RhoGEF domain


1.6e-23 


91.6 


1784 


RhoGEF


RhoGEF domain 


1.6e-23 


91.6


1785 


rrm


RNA recognition motif. 


6.4e-14 


59.7 



TABLE 5



SEQ ID NO:


POSITION OF 
SIGNAL IN AMINO 
ACID SEQUENCE 


MaxS (MAXIMUM 
SCORE) 


MeanS (MEAN
SCORE) 


1 


1-21 


0.991 


0.955 


2 


1-31 


0.995


0.944


3 


1-33 


0.949


0.736 


4 


1-19 


0.970 


0.951 


5 


1-26 


0.971 


0.863 


6 


1-26 


0.971 


0.863 


7 


1-26 


0.971 


0.863 


8 


1-26 


0.971 


0.863 


9 


1-46 


0.982 


0.901 


10 


1-21 


0.991 


0.955 


11 


1-23 


0.989 


0.899


12 


1-25 


0.955 


0.803 


13 


1-18 


0.932 


0.625 


14 


1-18 


0.938 


0.876 


15 


1-25 


0.941 


0.811 


16 


1-17 


0.972 


0.939 


17 


1-27 


0.964


0.777 


18 


1-16 


0.914 


0.657 


19 


1-19 


0.953 


0.840 


20 


1-20 


0.935 


0.701 


21 


1-22 


0.974


0.850 


22 


1-33 


0.961 


0.895 


23 


1-19 


0.991 


0.959 


24 


1-31 


0.995 


0.944 


25 


1-22 


0.976 


0.935 


26 


1-27 


0.996 


0.928 


27 


1-24 


0.953 


0.739 


28 


1-21 


0.906 


0.688 


29 


1-31 


0.986 


0.841 


30 


1-28 


0.980 


0.893 


31 


1-19 


0.993 


0.976 


32 


1-22 


0.998 


0.909


35 


1-33 


0.949 


0.736 


36 


1-33 


0.949 


0.736 


46 


1-19 


0.970 


0.951


67 


1-25 


0.968 


0.848 


71 


1-18 


0.949 


0.845 


72 


1-30 


0.991 


0.919 


75 


1-29 


0.958 


0.854 


88 


1-20 


0.986 


0.945 


94 


1-33 


0.994 


0.943 


97 


1-46 


0.964 


0.595 


103 


1-49 


0.983 


0.570 


108 


1-26 


0.978 


0.885 


111 


1-23 


0.989 


0.899


126 


1-25 


0.955 


0.803 


129 


1-19 


6.96*3 


0.918 


138 


1-29 


0.971 


0.844 


143 


1-18 


0.914 


0.628 


148 


1-20 


0.969 


0.904 


156 


1-25 


0.941 


0.811 


158 


1-22 


0.979 


0.927 


160 


1-17 


0.972 


0.939 


161 


1-48 


0.903 


0.571 


162 


1-25 


0.937 


0.729 


168


1-16 


0.939 


0.826 


171 


1-27 


0.964 


0.777 


178 


1-21 


0.945 


0.825


180 


1-27 


0.981 


0.941 


187


1-28 


0.982 


0.936


190 


1-19 


0.953 


0.840 

196


1-22 


0.975 


0.916 


197 


1-22 


0.963


0.936


199 


1-20 


0.935 


0.701 


200 


1-23 


0.977


0.773


206 


1-30 


0.984 


0.890 


207 


1-19 


0.990 


0.924 


208 


1-22 


0.974 


0.850 


210 


1-40 


0.940 


0.670 


211 


1-28 


0.971 


0.849 


216 


1-24 


0.986 


0.956 


218 


1-33 


0.961 


0.895 


219 


1-19 


0.970 


0.871 


221 


1-19 


0.904 


0.553 


222 


1-21 


0.917 


0.555 


230 


1-19 


0.991 


0.959 


231 


1-26 


0.953 


0.800 


232 


1-25 


0.988 


0.826 


239 


1-23 


0.969 


0.628 


240 


1-17 


0.982


0.955 


241 


1-17 


0.982 


0.955 


245 


1-30 


0.970


0.722 


248 


1-22 


0.976 


0.935 


249 


1-23 


0.968 


0.940


252 


1-18 


0.971 


0.923 


261 


1-24 


0.883 


0.587


265 


1-18 


0.939 


0.668


272 


1-24 


0.953 


0.739


283 


1-21 


0.906 


0.688 


284 


1-29 


0.997 


0.854 


290 


1-31 


0.986 


0.841 


302 


1-28 


0.980 


0.893 


304 


1-16 


0.907 


0.635 


312 


1-19 


0.993 


0.976 


313 


1-17 


0.930 


0.753 


323 


1-22 


0.998 


0.909 


324 


1-17 


0.982 


0.954 


328 


1-19 


0.971 


0.865 


329 


1-22 


0.963 


0.924 


330 


1-33 


0.978 


0.841


331 


1-24 


0.920 


0.712 


332 


1-24 


0.975 


0.881 


333 


1-19


0.984 


0.941 


334 


1-20 


0.899 


0.567 


335 


1-27 


0.942 


0.813 


336 


1-20 


0.952


0.850 


337 


1-38 


0.942 


0.653 


338 


1-27 


0.973 


0.772


339 


1-36 


0.979 


0.804 


340 


1-27 


0.888 


0.597 


343 


1-19 


0.971 


0.865 


344 


1-22 


0.994 


0.928 


345 


1-17 


0.966 


0.687 


346 


1-19 


0.936 


0.822 


347 


1-22 


0.963 


0.924 


349 


1-24 


0.982 


0.966 


351 


1-21 


0.918 


0.815 


352 


1-31 


0.988 


0.912 


354 


1-31 


0.974 


0.839 


355 


1-29 


0.932 


0.632 


356 


1-15 


0.994 


0.969 


357 


1-33 


0.935 


0.726


360 


1-27 


0.938 


0.827 


361 


1-25 


0.954


0.674 


362 


1-22 


0.929 


0.788


363 


1-21 


0.881


0.715 


364 


1-33 


0.978 


0.841 


365 


1-33 


0.978 


0.841


366 


1-21 


0.916 


0.820 


367


1-19 


0.936 


0.822 


368 


1-29 


0.972 


0.874 


370 


1-24 


0.920 


0.712 


371


1-24 


0.961 


0.773 


372 


1-27 


0.919 


0.768 


373 


1-19 


0.986 


0.945 


375 


1-32 


0.994 


0.932


376 


1-34 


0.987 


0.810 


377 


1-17 


0.995 


0.950 


378 


1-49 


0.971 


0.749 


380 


1-20 


0.968 


0.874 


381 


1-20 


0.928 


0.782 


382 


1-19 


0.986 


0.934 


383 


1-28 


0.965 


0.829 


384 


1-39 


0.970 


0.551 


386 


1-24 


0.975 


0.881


388 


1-30 


0.989 


0.868 


389 


1-19 


0.984


0.941 


390 


1-26 


0.971 


0.782 


392 


1-20 


0.981 


0.900 


393 


1-16 


0.968 


0.890 


394 


1-23 


0.937 


0.701 


397 


1-22 


0.985 


0.854 


399 


1-46 


0.977 


0.698 


401 


1-20 


0.899 


0.567 


402 


1-22 


0.967 


0.931 


403 


1-27 


0.992 


0.934 


404 


1-19 


0.991 


0.973 


405 


1-23 


0.994 


0.921 


407 


1-35 


0.987


0.658 


408 


1-39 


0.976 


0.551 


409 


1-33 


0.897 


0.570 


410 


1-25 


0.990 


0.962 


411 


1-38 


0.977 


0.827 


412 


1-20 


0.944 


0.768 


413 


1-20 


0.988


0.965 


414 


1-46 


0.993 


0.638 


415 


1-23 


0.981 


0.940 


417 


1-29 


0.941 


0.672 


418 


1-20 


0.952 


0.850 


419 


1-19 


0.986 


0.967 


420 


1-29 


0.965 


0.861 


421 


1-22 


0.889


0.785 


422 


1-48 


0.982 


0.862 


424 


1-19 


0.979 


0.933 


428 


1-38 


0.942 


0.653 


430 


1-18 


0.947 


0.595 


432 


1-33 


0.957 


0.789 


433 


1-26 


0.979 


0.904 


434 


1-27 


0.962 


0.777 


435 


1-24 


0.998 


0.977 


436 


1-27 


0.973 


0.772 


443 


1-15 


0.966 


0.940 


448 


1-36 


0.979 


0.804 


453 


1-41 


0.958 


0.609 


455 


1-33 


0.943 


0.606 


457 


1-27 


0.888 


0.597 


462 


1-16 


0.925 


0.681 


486 


1-27 


0.972


0.845 


495 


1-24 


0.917 


0.636 


498 


1-26 


0.993 


0.890 


505 


1-20 


0.976


0.926 


507 


1-17 


0.966 


0.687 


510 


1-23 


0.930 


0.593


511 


1-23 


0.930 


0.593 


512 


1-23 


0.930


0.593 


515 


1-18 


0.978 


0.956 


523 


1-19 


0.936 


0.822 


529 


1-22 


0.963 


0.924


545 


1-24 


0.982 


0.966 


550 


1-30 


0.933 


0.713 


552 


1-21 


0.973


0.812


554 


1-23 


0.969 


0.784 


571 


1-21 


0.918 


0.815 


574 


1-31 


0.988 


0.912 


580 


1-39 


0.925 


0.556 


594 


1-31 


0.974 


0.839 


608 


1-29 


0.932 


0.632 


609 


1-29 


0.932 


0.632 


610 


1-21 


0.990 


0.948


621 


1-15 


0.994 


0.969 


623 


1-33 


0.935 


0.726 


653 


1-27 


0.938 


0.827


668 


1-22 


0.929 


0.788 


677 


1-16 


0.948


0.807


685 


1-21 


0.881 


0.715 


699 


1-22 


0.975 


0.816 


702 


1-31 


0.968 


0.898 


707 


1-16 


0.880 


0.562 


713 


1-25 


0.966 


0.743 


718 


1-19 


0.936 


0.822 


719 


1-20 


0.961


0.824


729 


1-29 


0.972 


0.874 


735 


1-46 


0.903 


0.598 


746 


1-14 


0.916 


0.730 


747 


1-22 


0.965 


0.876 


748 


1-29 


0.968 


0.785 


759 


1-24 


0.961 


0.773 


767 


1-27 


0.919 


0.768 


768 


1-33 


0.900 


0.585 


773 


1-42 


0.959 


0.702 


779 


1-19 


0.986 


0.945 


797 


1-19 


0.944 


0.759 


798 


1-19 


0.900


0.568 


820 


1-17 


0.995 


0.950 


827 


1-49 


0.971 


0.749 


848 


1-20 


0.968 


0.874 


864 


1-20 


0.928 


0.782 


866 


1-19 


0.986 


0.934 


873 


1-23 


0.948 


0.886 


881 


1-28 


0.965


0.829 


887 


1-39 


0.970 


0.551 


927 


1-30 


0.989 


0.868 


934 


1-48 


0.988 


0.777


939 


1-39 


0.994 


0.889


944 


1-26 


0.971 


0.782 


950 


1-29 


0.957 


0.845 


963 


1-20 


0.981 


0.900 


964 


1-20 


0.886 


0.558 


973 


1-16 


0.968 


0.890


980 


1-34 

0.961


0.749 


981 


1-20


0.953 


0.822 


984 


1-12 


0.938


0.780


1015 


1-22 


0.985 


0.854


1040 


1-46 


0.977 


0.698 


1052 


1-18 


0.969 


0.842


1059 


1-20


0.927


0.867 


1065 


1-33 


0.983


0.918 


1069 


1-22 


0.993 


0.935




1-27 


0.992


0.934


1080 


1-19 



0.931 


0 . 8i9 




1-19 


0.991


0.973


1094 


1-46 


0.992 


0.653


1095 


1-30 


0.974


0.929


XiUj 


1-23


0.994


0.921


1123 




0.987


0.658


1138


1-32 


0.954


0.613


11 / ft 


1 - 3 B 


0.989


0.789


1142


1-33 


0.897 


0.570




1-25 


0.990


0.962 


1170 


1-38 


0.977


0.027 


1176 


1-20 


0.944


0.768


1187 


1-20 


0.988 


0.965


1189 


X- 3 b 


0.967


0.839


1192 


i : ac — — 


0.993


0.638


1193 


1 - lb 


0.925


0.710 


1 1 Q7 


1-29 


0.985


0.853 


IaUO 


1-23


0.981


0.940 


"1 "?7t^ 


1-29 


0.941


0.672 




1-19 


0.986 


0.967 


1 oco 
IaSO 


1-29 


0.965


0.861


1 9cc 


1-22 


0.889


0.785 


IfiDD 


1-20 


0.944


0.809 


1<S / O 


1-48 


0.982


0.862


1 9 Q7 
1<G 


1-19 


0.979


0.933 


1& JO 


1-21 


0.984


0.944 


x^y t 


1-19 


0.984


0.953 


1 ITT 
1JJ<( 


1-38


0.942


0.653 


1358


1-18 


0.947 


0.595 


ij / i 


1-33 


0.957


0.789 


IjOU 


1-26 


0.979


0.904 


Ij?/ 


1-27 


0.962


0.777 




1-23 


0.997 


0.960 


1404


1-24 


0.998 


0.977 


i 41 n 

l<xlU 


1-15 


0.946


0.84^ 




1-24


0.913 


0.588 


1 a i tr 

1413 


1-19 


0.982 


0.929 


1 4 1 £ 

l*i ID 


1-12 


0.931


0.891 


vZT5 


1-30 


0.933 


0.563 


1 4 9 rS 
1** z u 


1-20 


0.881


0.561 


1421 


1-19 


0.990


0.968 


1423 


1-17 


0.968


0.863 


1424 


1-21 


0.885


0.591


1425 


i — z. 1 * 


0.913


0.588


1426 


1-24 


0.913


0.588


1428 


1-25 


0.967


0.899 


1430 


1-34 


0.977


0.819 


1431 


1-28 


0.979


0.923 


1432 


l-3o 


0.957


0.613 


1433 


1-32 


0.921


0.753 


1434 


1-39 


0.983


0.621 


l*t J3 


1-25 


0.910


0.631 


1 * Jo 


1-42 


0.988


0.868 


1 4 7 
i*« J / 


1-22 


0.998


0.980 


J.£4^ 


1-20 


0.918 


0.753


1448 


1-12 


0.931


0.891


1462 


1-18 


0.968 


0.888 


1490 


1-20 


0.881 


0.561 


1518


1-17 


0.968 


0.863


1525 


1-21 


0.885 


0.5^1 


1547 


1-28 


0.974 


0.891 


1561 


1-25 


0.967 


0.899 


1580 


1-17 


0.923 


0.824


1593 


1-28 


0.979 

0.923


1596 


1-16


u . j z y 


0.709


1601 


1-36 


U . 33 / 


0.613 


1606 


1-22 


0.979


0.831


1607 


1-20 




0.770


1608 




U . 3 Z J. 


0.753 


1614 


1-33 


0.969 


0.829


1616 


1-20 


0.959 


0.869


1625 


1-39 


0.983 


0.621 


1632 


1-25 


0.910 


0.*31 


1636 


1-33 


0.897


0.591 


16"34 


1-42 


0.988 


0.868 


1645 


1-20 


0.927 


0.568 


1647 


1-17 


0.923 


0.742 


1648 


1-22 


0.998 


0.980 



TABLE 6



SEQ ID NO:
of full-
length
nucleotide
sequence


SEQ ID
NO: of
full-
length
peptide
sequence


SEQ ID NO:
of contig
nucleotide
sequence


SEQ ID
NO:
of contig
peptide
sequence


Priority
docket number_
corresponding
SEQ ID NO: in
priority
application


SEQ ID
NO: in
U.S. S.N.
09/488,725






1


1787


3573


5359


784CIP2_1


1103


2


1788


3574


5360


784CIP2_2


2673


3


1789


3575


5361


784CIP2_3


4117


4


1790


3576


5362


784CIP2_4


5556


5


1791


3577


5363


784CIP2_5


5562


6


1792


3578


5364


784CIP2_6


5562


7


1793


3579


5365


784CIP2_7


5562


8


1794


3580


5366


784CIP2_8


5562


9


1795


3581


5367


784CIP2_9


5563


10 


1796 


3582


5368 


784CIP2_10


5564 


11 


1797 


3583


5369 


784CIP2_11


5565 


12 


1798 


3584 


5370 


784CIP2_12


5689 


13 


1799 


3585 


5371 


784CIP2_13 


5729 


14 


1800 


3586


5372 


784CIP2_14 


5745 


15 


1801 


3587 


5373 


784CIP2_15


5777 


16


1802 


3588 


5374 


784CIP2_16


5777 


17 


1803 


3589 


5375 


784CIP2_17 


5789 


18 


1804 


3590 


5376 


784CIP2_18 


5792 


19 


1805 


3591 


5377 


784CIP2_19 


5804 


20 


1806 


3592 


5378 


784CIP2_20 


5805 


21 


1807 


3593 


5379 


784CIP2_21


5805 





1808 


3594 


5380 


784CIP2_22 


5844


23 


1809 


3595 


5381 


784CIP2_23


5844 

24


1810


3596 


5382 


784CIP2_24


5850 




1811 


3597 


5383


784CIP2_25


5867 


26


1812 


3598 


5384 


784CIP2_26 


5973 


27


1813 


3599 


5385 


784CIP2_27 


5995 


28 


1814 


3600 


5386 


784CIP2_28 


5995 


29


1815 


3601


5387 


784CIP2_29


6005 


30 


1816 


3602 


5388 


784CIP2_30 


6007 


31 


1817


3603 


5389 


784CIP2_31


6007


32 


1818 


3604 


5390 


784CIP2_32 


6009 


33 


1819 


3605 


5391 


784CIP2_33


6012 


34


1820 


3606 


5392 


784CIP2_34


6015


35


1821 


3607 


5393 


784CIP2_35 


6016 


36


1822


3608 


5394 


784CIP2_36


6016 


37


1823 


3609 


5395 


784CIP2_37 


6018 


38


1824


3610 


5396 


784CIP2_38


6018 


39


1825 


3611 


5397 


784CIP2_39 


6018 


40


1826 


3612


5398 


784CIP2_40


6023 


41


1827 


3613


5399 


784CIP2_41


6070 


42


1828


3614


5400 


784CIP2_42


6081 




1829


3615 


5401 


784CIP2_43


6089 


44


1830


3616 


5402 


784CIP2_44 


6118 


45 


1831 


3617 


5403 


784CIP2_45 


6118 


46 


1832 


3618 


5404 


784CIP2_46 


6130 


47 


1833 


3619 


5405 


784CIP2_47 


6177 


48 


1834 


3620


5406 


784CIP2_48


6189 


49


1835 


3621 


5407 


784CIP2_49


6191 


50 


1836 


3622 


5408 


784CIP2_50


6204 


51 


1837 


3623 


5409 


784CIP2_51


6204 


52 


1838 


3624 


5410 


784CIP2_52 


6284 


53 


1839 


3625 


5411 


784CIP2_53


6367 


54 


1840 


3626 


5412 


784CIP2_54 


6436 


55 


1841 


3627 


5413 


784CIP2_55


6442 


56 


1842 


3628 


5414 


784CIP2_56 


6445 


57 


1843 


3629 


5415 


784CIP2_57


6457 


58 


1844 


3630 


5416 


784CIP2_58 


6458 


59 


1845 


3631 


5417 


784CIP2_59


6458


60


1846 


3632 


5418 


784CIP2_60 


6462 


61 


1847 


3633 


5419 


784CIP2_61


6472 


62 


1848 


3634 


5420 


784CIP2_62


6499 


63 


1849 


3635 


5421 


784CIP2_63


6499 


64 


1850 


3636 


5422 


784CIP2_64


6505 


65 


1851 


3637 


5423 


784CIP2_65


6534 


66 


1852 


3638 


5424 


784CIP2_66 


6534 


67 


1853 


3639


5425 


784CIP2_67


6540 


68 


1854 


3640 


5426 


784CIP2_68 


6550 


69 


1855 


3641 


5427 


784CIP2_69 


6550 


70 


1856 


3642 


5428 


784CIP2_70


6592 


71 


1857 


3643 


5429 


784CIP2_71


6645 


72 


1858 


3644 


5430 


784CIP2_72


6671 


73 


1859 


3645 


5431 


784CIP2_73 


6763 


74 


1860 


3646 


5432 


784CIP2_74 


6763 


75 


1861 


3647 


5433 


784CIP2_75


6786 


76 


1862 


3648 


5434 


784CIP2_76


6824 


77 


1863 


3649 


5435 


784CIP2_77 


6830


78 


1864 


3650 


5436 


784CIP2_78 


6831 


79 


1865 


3651 


5437 


784CIP2_79 


6832 


80 


1866 


3652 


5438 


784CIP2_80 


6834 


81 


1867 


3653 


5439 


784CIP2_81 


6834 


82 


1868 


3654 


5440 


784CIP2_82 


6835 


83 


1869 


3655 


5441 


784CIP2 83 


6837 


84 


1870 


3656 


5442 


784CIP2_84 


6843 


85 


1871 


3657 


5443 


784CIP2_85


6859 


86 


1872 


3658 


5444 


784CIP2_86 


6915 


87 


1873 


3659 


5445 


784CIP2 87 


6932 


88 


1874 


3660 


5446 


784CIP2_88 


6957 


89 


1875 


3661 


5447 


784CIP2_89 


6961 


90 


1876 


3662 


5448 


784CIP2_90 


6973 


91 


1877 


3663 


5449 


784CIP2_91


6973 


92 


1878 


3664 


5450 


784CIP2_93


7007


93 


1879 


3665 


5451 


784CIP2_94 


7018 


94 


1880 


3666 


5452 


784CIP2_95 


7019 


95 


1881 


3667 


5453 


784CIP2_96 


7020


96 


1882 


3668 


5454 


784CIP2_97


7020 


97 


1883 


3669 


5455 


784CIP2 98 


7021 


98 


1884 


3670 


5456 


784CIP2_99 


7023 


99 


1885 


3671 


5457 


784CIP2 100 


7027 


100 


1886 


3672 


5458 


784CIP2_101 


7028 


101 


1887 


3673 


5459 


784CIP2 102 


7029 


102 


1888 


3674 


5460 


784CIP2 103 


7031 


103


1889 


3675 


5461 


784CIP2_104 


7032 


104 


1890 


3676 


5462 


784CIP2_105 


7033 


105 


1891 


3677 


5463 


784CIP2 106 


7035 


106 


1892 


3678 


5464 


784CIP2_107 


7036 


107 


1893 


3679 


5465 


784CIP2_108 


7039 


108 


1894 


3680 


5466 


784CIP2_109 


7043 


109 


1895 


3681 


5467 


784CIP2_110 


7044 


110 


1896 


3682 


5468 


784CIP2 111 


7046 


111


1897 


3683 


5469 


784CIP2_112 


7054 


112 


1898 


3684 


5470 


784CIP2_113 


7061


113


1899 


3685 


5471 


784CIP2_114 


7077 


114 


1900 


3686 


5472 


784CIP2_115


7092 


115 


1901 


3687 


5473 


784CIP2_116 


7094 


116 


1902 


3688 


5474 


784CIP2_117


7106 


117 


1903 


3689 


5475 


784CIP2_118


7107 


118 


1904 


3690 


5476 


784CIP2 119 


7111 


119 


1905 


3691 


5477 


784CIP2_120


7123 


120 


1906 


3692 


5478 


784CIP2 121 


7142 


121 


1907 


3693 


5479 


784CIP2_122 


7142


122 


1908 


3694 


5480 


784CIP2_123 


7154 


123 


1909 


3695 


5481 


784CIP2 124 


7160 


124 


1910 


3696 


5482 


784CIP2 125 


7169 


125 


1911 


3697 


5483 


784CIP2_126


7185 


126 


1912 


3698 


5484 


784CIP2 127 


7197 


127 


1913 


3699 


5485 


784CIP2_128 


7219 


128 


1914 


3700 


5486 


784CIP2_129 


7226 


129 


1915 


3701 


5487 


784CIP2_130 


7229 


130


1916 


3702 


5488 


784CIP2_131 


7234 


131 


1917 


3703 


5489 


784CIP2_132 


7235 


132 


1918 


3704 


5490 


784CIP2 133 


7235 


133 


1919 


3705 


5491 


784CIP2_134 


7238 


134 


1920 


3706 


5492 


784CIP2 135 


7247 


135 


1921 


3707 


5493 


784CIP2_136 


7261 


136 


1922 


3708 


5494 


784CIP2_137 


7262


137 


1923 


3709 


5495 


784CIP2_138 


7267 


138 


1924 


3710 


5496 


784CIP2_139 


7272 


139 


1925 


3711 


5497 


784CIP2_140


7273 


140 


1926 


3712 


5498


784CIP2_141 


7282 


141 


1927 


3713 


5499 


784CIP2_142 


7288 


142 


1928


3714 


5500 


784CIP2_143 


7291 


143 


1929 


3715 


5501 


784CIP2_144


7293 


144 


1930 


3716 


5502


784CIP2_145 


7294 


145 


1931 


3717 


5503 


784CIP2 146 


7299 


146 


1932 


3718 


5504 


784CIP2_147 


7300 


147 


1933 


3719 


5505 


784CIP2 148 


7312 


148 


1934


3720 


5506 


784CIP2_149 


7313 


149 


1935 


3721 


5507 


784CIP2 150 


7315 


150 


1936 


3722 


5508 


784CIP2 151 


7318 


151 


1937 


3723 


5509 


784CIP2_152 


7321 


152 


1938 


3724 


5510 


784CIP2 153 


7330 


153 


1939 


3725 


5511 


784CIP2_154 


7331 


154 


1940 


3726 


5512 


784CIP2 155 


7333 


155 


1941 


3727 


5513 


784CIP2_156


7350


156 


1942 


3728 


5514 


784CIP2_157 


7352 


157 


1943 


3729 


5515 


784CIP2 158 


7384 


158 


1944 


3730 


5516 


784CIP2_159 


7403 


159 


1945 


3731 


5517 


784CIP2_160 


7431 


160 


1946 


3732 


5518 


784CIP2_161 


7441 


161 


1947 


3733 


5519 


784CIP2_162


7453 


162 


1948 


3734 


5520 


784CIP2_163


7467 


163 


1949 


3735 


5521 


784CIP2 164 


7471 


164 


1950 


3736 


5522 


784CIP2 165 


7493 


165 


1951 


3737 


5523 


784CIP2_166 


7502 


166 


1952 


3738 


5524 


784CIP2_167 


7511 


167 


1953 


3739 


5525 


784CIP2_168


7514 


168 


1954 


3740 


5526 


784CIP2_169 


7526 


169 


1955 


3741 


5527 


784CIP2_170


7541 


170 


1956 


3742 


5528 


784CIP2 171 


7570 


171 


1957 


3743 


5529 


784CIP2 172 


7578 


172 


1958 


3744 


5530 


784CIP2_173 


7583 


173 


1959 


3745 


5531 


784CIP2_174 


7592 


174 


1960 


3746 


5532 


784CIP2_175


7601 


175 


1961 


3747 


5533 


784CIP2_176 


7602 


176 


1962 


3748 


5534 


784CIP2 177 


7608 


177 


1963 


3749


5535 


784CIP2 178 


7615 


178 


1964 


3750 


5536 


784CIP2_179 


7617 


179 


1965 


3751 


5537 


784CIP2 181 


7624 


180 


1966 


3752 


5538 


784CIP2_182


7612


181


1967 


3753 


5539 


784CIP2_183 


7640 


182 


1968 


3754 


5540 


784CIP2 184 


7641 


183 


1969 


3755 


5541 


784CIP2 185 


7641


184


1970 


3756 


5542


784CIP2_186 


7641 


185


1971 


3757 


5543 


784CIP2_187 


7642 


186


1972 


3758


5544 


784CIP2_188 


7649 


187


1973 


3759 


5545 


784CIP2_189 


7656 


188


1974 


3760 


5546


784CIP2_190 


7657 


189 


1975 


3761 


5547 


784CIP2 191 


7657 


190 


1976 


3762 


5548 


784CIP2 192 


7662 


191 


1977 


3763 


5549 


784CIP2_193 


7668 


192 


1978 


3764 


5550 


784CIP2_194 


7673 


193 


1979 


3765 


5551 


784CIP2 195 


7690 


194 


1980 


3766 


5552 


784CIP2 196 


7700 


195 


1981 


3767 


5553 


784CIP2_197 


7709 


196 


1982 


3768 


5554 


784CIP2 198 


7736 


197 


1983 


3769 


5555 


784CIP2_199 


7737 


198 


1984 


3770 


5556 


784CIP2_200 


7744 


199 


1985 


3771 


5557 


784CIP2_201 


7771 


200 


1986 


3772 


5558 


784CIP2_202 


7786 


201 


1987 


3773 


5559 


784CIP2_203 


7791 


202 


1988 


3774 


5560 


784CIP2_204 


7797 


203 


1989 


3775 


5561 


784CIP2_205


7806 


204 


1990 


3776 


5562 


784CIP2_206 


7812 


205 


1991 


3777 


5563 


784CIP2_207 


7812 


206 


1992 


3778 


5564 


784CIP2_208 


7818


207 


1993 


3779 


5565 


784CIP2_209 


7822 


208 


1994 


3780 


5566 


784CIP2_210 


7827 


209 


1995 


3781 


5567 


784CIP2_211 


7830 


210 


1996 


3782 


5568 


784CIP2_212 


7835 


211 


1997 


3783 


5569 


784CIP2_214 


7840


212 


1998 


3784 


5570 


784CIP2 215 


7858


213 


1999 


3785 


5571 


784CIP2 216 


7858 


214 


2000 


3786 


5572 


784CIP2_217 


7861 


215 


2001 


3787 


5573 


784CIP2_218 


7866 


216 


2002


3788 


5574 


784CIP2_219


7868


217 


2003 


3789 


5575 


784CIP2 220 


7896 


218 


2004 


3790 


5576 


784CIP2_221 


7898 


219 


2005 


3791 


5577 


784CIP2_222 


7900 


220 


2006 


3792 


5578 


784CIP2_223 


7906 


221 




2007 


3793 


5579 


784CIP2_224


7908


222


2008 


3794


5580 


784CIP2 225 


7909


223


2009 


3795 


5581 


784CIP2_226 


7917 


224 


2010 


3796 


5582 


784CIP2_227 


7932 


225 


2011 


3797 


5583 


784CIP2_228 


7940 


226 


2012 


3798 


5584 


784CIP2_229 


7940 


227 


2013 


3799 


5585 


784CIP2_230 


7984 


228


2014 


3800


5586 


784CIP2_231 


7984 


229


2015 


3801


5587 


784CIP2_232 


8001 


230


2016 


3802 


5588 


784CIP2 233 


8021 


231 


2017 


3803 


5589 


784CIP2 234 


8029 


232 


2018 


3804 


5590 


784CIP2_235


8033 


233 


2019 


3805 


5591 


784CIP2_236 


8040 


234 


2020 


3806 


5592 


784CIP2_237 


8052 


235 


2021 


3807 


5593 


784CIP2_238 


8096 


236 


2022 


3808 


5594 


784CIP2_239 


8096 


237


2023


3809 


5595 


784CIP2_240 


8113 


238 


2024 


3810 


5596 


784CIP2_241 


8126 


239 


2025 


3811 


5597 


784CIP2_242 


8132 


240 


2026 


3812 


5598 


784CIP2 243 


8137 


241 


2027


3813 


5599 


784CIP2 244 


8137 


242 


2028 


3814 


5600


784CIP2 245 


8159 


243 


2029 


3815 


5601 


784CIP2_246 


8159 


244 


2030 


3816 


5602 


784CIP2 247 


8161 


245 


2031 


3817 


5603 


784CIP2_248 


8176


246 


2032 


3818 


5604 


784CIP2_249 


8196 


247 


2033 


3819 


5605 


784CIP2 250 


8200 


248 


2034 


3820 


5606 


784CIP2 251 


8212 


249 


2035 


3821 


5607 


784CIP2_252 


8220 


250 


2036 


3822 


5608 


784CIP2_253 


8238 


251 


2037 


3823 


5609 


784CIP2_254


8254 


252 


2038 


3824 


5610 


784CIP2_255 


8255 


253 


2039 


3825 


5611 


784CIP2 256 


8268


254 


2040 


3826 


5612 


784CIP2_257 


8296 


255 


2041 


3827 


5613 


784CIP2_258 


8329 


256 


2042 


3828 


5614 


784CIP2_259 


8362 


257 


2043 


3829 


5615 


784CIP2_260 


8429 


258 


2044 


3830 


5616 


784CIP2_261 


8436


259 


2045 


3831 


5617 


784CIP2_262 


8448 


260 


2046 


3832 


5618 


784CIP2_263 


8472 


261 


2047 


3833 


5619 


784CIP2_264 


8502 


262 


2048 


3834 


5620 


784CIP2_265 


8504 


263 


2049 


3835 


5621 


784CIP2 266 


8507 


264 


2050 


3836 


5622 


784CIP2_268


8509 


265 


2051 


3837 


5623 


784CIP2 269 


8515 


266 


2052 


3838 


5624 


784CIP2_270 


8519 


267 


2053 


3839 


5625 


784CIP2_271 


8530 


268 


2054 


3840 


5626 


784CIP2 272 


8532 


269 


2055 


3841 


5627 


784CIP2 273 


8532 


270 


2056 


3842 


5628 


784CIP2 274 


8539


271 


2057 


3843 


5629 


784CIP2_275 


8541 


272 


2058 


3844 


5630 


784CIP2 276 


8543 


273 


2059 


3845 


5631 


784CIP2 277 


8593 


274 


2060 


3846 


5632 


784CIP2_278


8595 


275 


2061 


3847 


5633 


784CIP2_279 


8615 


276 


2062 


3848 


5634 


784CIP2_280 


8620 


277 


2063 


3849 


5635 


784CIP2_281 


8621 


278 


2064 


3850 


5636 


784CIP2 282 


8623 


279 


2065 


3851 


5637 


784CIP2_283 


8625 


280 


2066 


3852 


5638 


784CIP2 284 


8628 


281 


2067 


3853 


5639 


784CIP2 285 


8628 


282 


2068 


3854 


5640 


784CIP2_286


8629 


283 


2069 


3855 


5641 


784CIP2 287 


8630 


284 


2070 


3856 


5642 


784CIP2_288 


8631 


285 


2071 


3857 


5643 


784CIP2 289 


8633 


286 


2072 


3858 


5644 


784CIP2_290 


8634 


287 


2073 


3859 


5645 


784CIP2_291 


8635 


288 


2074 


3860 


5646 


784CIP2_292 


8636 


289 


2075 


3861 


5647 


784CIP2_293 


8659 


290 


2076 


3862 


5648 


784CIP2_294


8660


291 


2077 


3863 


5649


784CIP2 295 


8667


292 


2078 


3864 


5650 


784CIP2 296 


8667 


293 


2079 


3865 


5651 


784CIP2 297 


8685 


294 


2080 


3866 


5652 


784CIP2 298 


8805 


295 


2081 


3867 


5653 


784CIP2 299 


8896 


296 


2082 


3868 


5654 


784CIP2_300 


8978 


297


2083 


3869 


5655 


784CIP2_301 


9046 


298


2084 


3870 


5656 


784CIP2_302 


9048 


299


2085 


3871 


5657 


784CIP2 303 


9116 


300 


2086 


3872 


5658 


784CIP2_304 


9195 


301 


2087 


3873 


5659 


784CIP2_305 


9201 


302


2088 


3874 


5660 


784CIP2 306 


9307 


303 


2089 


3875 


5661 


784CIP2 307 


9321 


304 


2090 


3876 


5662 


784CIP2_308 


9397 


305 


2091 


3877 


5663 


784CIP2_309


9405 


306 


2092 


3878 


5664


784CIP2_310


9406 


307 


2093 


3879


5665


784CIP2_311


9422


308 


2094 


3880 


5666 


784CIP2_312


9494 


309 


2095 


3881 


5667 


784CIP2_313


9512 


310 


2096 


3882 


5668 


784CIP2_314 


9632 


311 


2097 


3883 


5669 


784CIP2_315 


9661 


312 


2098 


3884 


5670 


784CIP2_316 


9664 


313 


2099 


3885 


5671 


784CIP2_317


9691 


314 


2100 


3886 


5672 


784CIP2_318 


9700 


315 


2101 


3887 


5673 


784CIP2 319 


9716 


316 


2102 


3888


5674 


784CIP2_320 


9721 


317 


2103 


3889 


5675 


784CIP2 321 


9870 


318 


2104 


3890 


5676 


784CIP2_322


9887 


319 


2105 


3891 


5677 


784CIP2_323


9923 


320 


2106 


3892 


5678 


784CIP2_324 


9938 


321 


2107 


3893 


5679 


784CIP2_325 


9964 


322 


2108 


3894 


5680 


784CIP2_326 


10007 


323


2109 


3895 


5681 


784CIP2_327 


10009 


324 


2110 


3896 


5682 


784CIP2_328 


10046 


325 


2111 


3897 


5683 


784CIP2 329 


10156


326 


2112 


3898 


5684 


784CIP2_330 


1027* 


327 


2113 


3899 


5685 


784CIP2_331 


10283 


328 


2114 


3900 


5686


784CIP2B_1


152 


329 


2115 


3901 


5687 


784CIP2B_2


167 


330 


2116 


3902 


5688 


784CIP2B_3 


205 


331 


2117 


3903 


5689 


784CIP2B 4 


210 


332 


2118 


3904 


5690 


784CIP2B_5


225 


333 


2119 


3905 


5691 


784CIP2B_6 


226 


334 


2120 


3906 


5692 


784CIP2B_7


264 


335 


2121 


3907 


5693 


784CIP2B_8 


268 


336 


2122 


3908 


5694 


784CIP2B 9 


293 


337 


2123 


3909 


5695 


784CIP2B_10 


293 


338 


2124 


3910 


5696 


784CIP2B 11 


293 


339 


2125 


3911 


5697 


784CIP2B_12 


302 


340 


2126 


3912 


5698 


784CIP2B_13 


311


341 


2127 


3913 


5699 


784CIP2B 14 


352 


342 


2128 


3914 


5700 


784CIP2B_15 


358


343 


2129 


3915 


5701 


784CIP2B_16 


368 


344 


2130 


3916 


5702 


784CIP2B_17 


393 


345 


2131 


3917 


5703 


784CIP2B_18


477 


346 


2132 


3918 


5704 


784CIP2B_19 


508 


347 


2133 


3919


5705 


784CIP2B_20 


508 


348 


2134 


3920 


5706 


784CIP2B_21 


515 


349 


2135 


3921 


5707 


784CIP2B_22 


578 


350 


2136 


3922 


5708 


784CIP2B_23 


588 


351 


2137 


3923 


5709 


784CIP2B_24 


591 


352 


2138 


3924 


5710 


784CIP2B_25 


593 


353 


2139 


3925 


5711 


784CIP2B_26 


594 


354 


2140 


3926 


5712 


784CIP2B_27 


619 


355 


2141 


3927 


5713 


784CIP2B_28 


620 


356 


2142 


3928 


5714 


784CIP2B_29 


654 


357 


2143 


3929 


5715 


784CIP2B 30 


692 


358 


2144 


3930 


5716 


784CIP2B_31 


753 


359 


2145 


3931 


5717 


784CIP2B_32


758 


360 


2146 


3932 


5718 


784CIP2B_33 


787 


361


2147 


3933 


5719 


784CIP2B 34 


833 


362 


2148 


3934 


5720 


784CIP2B_35


838


363 


2149 


3935 


5721 


784CIP2B_36


870 


364 


2150 


3936 


5722 


784CIP2B_37


891 


365 


2151 


3937 


5723 


784CIP2B_38


891 


366 


2152 


3938 


5724 


784CIP2B_39 


921 


367 


2153 


3939 


5725 


784CIP2B_40 


924 


368 


2154 


3940 


5726 


784CIP2B_41 


932 


369 


2155 


3941 


5727 


784CIP2B_42 


942


370


2156


3942 


5728 


784CIP2B_43 


958


371


2157


3943 


5729 


784CIP2B_44


968 


372


2158


3944 


5730


784CIP2B 45 


992 


373


2159


3945


5731


784CIP2B 46 


1025 


374


2160 


3946 


5732 


784CIP2B_47 


1074 


375


2161


3947 


5733 


784CIP2B 48 


1104 


376


2162


3948 


5734 


784CIP2B_49


1114 


377


2163 


3949 


5735 


784CIP2B_50 


1144 


378 


2164


3950 


5736 


784CIP2B 51 


1262


379


2165 


3951 


5737 


784CIP2B 52 


1318 


380 


2166 


3952 


5738 


784CIP2B 53 


1319 


381


2167 


3953 


5739 


784CIP2B 54 


1328 


382 


2168 


3954 


5740 


784CIP2B 55 


1436 


383 


2169 


3955 


5741 


784CIP2B_56 


1464 


384 


2170 


3956 


5742 


784CIP2B 57 


1584 


385 


2171 


3957 


5743 


784CIP2B_58 


1617 


386 


2172 


3958 


5744 


784CIP2B 59 


1724 


387 


2173 


3959 


5745 


784CIP2B_60 


1728 


388


2174 


3960 


5746


784CIP2B 61 


1772 


389 


2175 


3961 


5747 


784CIP2B_62 


1809 


390


2176 


3962 


5748


784CIP2B 63 


1868 


391


2177 


3963 


5749 


784CIP2B 64 


1898 


392 


2178 


3964 


5750 


784CIP2B_65 


1926 


393


2179 


3965 


5751 


784CIP2B_66 


1965


394 


2180 


3966 


5752 


784CIP2B 67 


1967 


395


2181 


3967 


5753 


784CIP2B_68 


1995 


396


2182 


3968 


5754 


784CIP2B_69


2005


397 


2183 


3969 


5755 


784CIP2B_70


2027 


398 


2184 


3970 


5756 


784CIP2B_71


2055 


399 


2185 


3971 


5757 


784CIP2B 72 


2103 


400


2186 


3972 


5758


784CIP2B 73 


2106 


401 


2187 


3973 


5759 


784CIP2B 74 


2166 


402 


2188 


3974 


5760 


784CIP2B_75


2175 


403 


2189 


3975 


5761 


784CIP2B_76 


2176 


404 


2190 


3976 


5762 


784CIP2B_78 


2236 


405


2191 


3977 


5763 


784CIP2B_79 


2250 


406 


2192 


3978 


5764 


784CIP2B_80 


2300


407 


2193 


3979 


5765


784CIP2B 81 


2323 


408 


2194 


3980 


5766 


784CIP2B_82 


2340 


409 


2195 


3981 


5767 


784CIP2B_83 


2371 


410 


2196 


3982 


5768 


784CIP2B_84 


2399 


411 


2197 


3983


5769 


784CIP2B_85 


2411 


412 


2198 


3984 


5770 


784CIP2B 86 


2428 


413


2199 


3985


5771 


784CIP2B_87 


2430 


414


2200 


3986 


5772 


784CIP2B_88 


2439 


415 


2201 


3987 


5773 


784CIP2B 89 


2447 


416


2202 


3988 


5774 


784CIP2B_90 


2461 


417


2203 


3989 


5775 


784CIP2B_91 


2487 


418


2204 


3990 


5776 


784CIP2B_92 


2492 


419 


2205 


3991 


5777 


784CIP2B_93 


2512 


420


2206 


3992 


5778 


784CIP2B_94 


2564 


421 


2207 


3993 


5779 


784CIP2B_95 


2578 


422 


2208 


3994 


5780 


784CIP2B_96 


2816 


423 


2209


3995 


5781 


784CIP2B 97 


2818 


424 


2210 


3996 


5782 


784CIP2B 98 


2819 


425 


2211 


3997 


5783 


784CIP2B_99 


2943 


426 


2212 


3998 


5784 


784CIP2B 100 


3137 


427 


2213 


3999 


5785 


784CIP2B_101 


3137 


428 


2214 


4000 


5786 


784CIP2B 102 


3160


429 


2215 


4001 


5787 


784CIP2B 103 


3323 


430 


2216 


4002 


5788 


784CIP2B_104 


3360 


431 


2217 


4003 


5789 


784CIP2B 105 


3362


432


2218 


4004 


5790 


784CIP2B_106


3417 


433 


2219 


4005 


5791 


784CIP2B_107 


3418 


434 


2220 


4006 


5792 


784CIP2B_108 


3442 


435 


2221 


4007 


5793 


784CIP2B_109 


3442 


436 


2222 


4008 


5794 


784CIP2B_110 


3444 


437 


2223 


4009 


5795 


784CIP2B_111 


3855 


438 


2224 


4010 


5796 


784CIP2B_112 


3863 


439 


2225 


4011 


5797 


784CIP2B_113 


4090 


440 


2226 


4012 


5798 


784CIP2B_114 


4105 


441 


2227


4013 


5799 


784CIP2B_115 


4142 


442 


2228 


4014 


5800 


784CIP2B 116 


4142 


443 


2229 


4015 


5801 


784CIP2B_117 


4149 


444 


2230 


4016 


5802 


784CIP2B 118 


4196 


445 


2231 


4017 


5803


784CIP2B_119 


4202 


446 


2232 


4018 


5804 


784CIP2B_120 


4274 


447 


2233 


4019 


5805 


784CIP2B_121


4304 


448 


2234 


4020 


5806 


784CIP2B_122


4306 


449 


2235 


4021 


5807 


784CIP2B 123 


4311 


450 


2236 


4022 


5808 


784CIP2B_124 


4321 


451 


2237 


4023 


5809 


784CIP2B_125 


4323 


452 


2238 


4024 


5810 


784CIP2B_126 


4332 


453 


2239 


4025 


5811 


784CIP2B_127 


4488 


454 


2240 


4026 


5812 


784CIP2B_128


4588 


455 


2241 


4027 


5813 


784CIP2B_129


5569 


456 


2242 


4028 


5814 


784CIP2B_130 


5573 


457 


2243 


4029 


5815 


784CIP2B 131 


5577 


458 


2244 


4030 


5816 


784CIP2B 132 


5579 


459 


2245 


4031 


5817 


784CIP2B_133 


5582 


460 


2246 


4032 


5818 


784CIP2B_134 


5583 


461 


2247 


4033 


5819 


784CIP2B_135


5584 


462 


2248 


4034 


5820 


784CIP2B_136 


5585 


463 


2249 


4035 


5821 


784CIP2B_137 


5591 


464 


2250 


4036 


5822 


784CIP2B_138


5593 


465 


2251 


4037 


5823 


784CIP2B_139 


5594 


466 


2252 


4038 


5824 


784CIP2B_140


5594 


467 


2253 


4039 


5825 


784CIP2B_141 


5598 


468 


2254 


4040 


5826 


784CIP2B_142


5602 


469 


2255 


4041 


5827 


784CIP2B 143 


5605 


470 


2256 


4042 


5828 


784CIP2B 144 


5608 


471 


2257 


4043 


5829 


784CIP2B_145


5617 




472 


2258


4044 


5830 


784CIP2B 146 


5620 


473 


2259 


4045 


5831 


784CIP2B_147


5622


474 


2260 


4046 


5832 


784CIP2B_148


5623 


475 


2261 


4047 


5833 


784CIP2B_149 


5624 


476 


2262 


4048 


5834 


784CIP2B_150


5625 


477 


2263 


4049 


5835 


784CIP2B_151 


5627 


478 


2264


4050 


5836 


784CIP2B_152 


5^23 


479 


2265 


4051 


5837 


784CIP2B_153 


5630 


480 


2266 


4052 


5838 


784CIP2B_154 


5632 


481 


2267 


4053 


5839 


784CIP2B 155 


5640 


482 


2268 


4054 


5840 


784CIP2B_156 


5641 


483 


2269 


4055 


5841 


784CIP2B_157 


5643 


484 


2270 


4056 


5842 


784CIP2B_158


5647 


485 


2271 


4057 


5843 


784CIP2B 159 


5649 


486 


2272 


4058 


5844 


784CIP2B_160


5658 


487 


2273 


4059 


5845 


784CIP2B_161


5659 


488 


2274 


4060 


5846 


784CIP2B 162 


5667 


489


2275 


4061 


5847 


784CIP2B 163 


5672 


490 


2276 


4062 


5848 


784CIP2B_164 


5674 


491


2277 


4063 


5849 


784CIP2B 165 


5678 


492 


2278 


4064 


5850 


784CIP2B 166 


5680 


493 


2279 


4065 


5851 


784CIP2B_167 


5684


494 


2280 


4066 


5852 


784CIP2B_168 


5686 


495 


2281 


4067 


5853 


784CIP2B_169 


5694 


496 


2282 


4068 


5854 


784CIP2B_170


5698 


497 


2283 


4069 


5855 


784CIP2B 171 


5699 


498 


2284 


4070 


5856 


784CIP2B_172 


5712 


499 


2285 


4071 


5857 


784CIP2B 173 


5719 


500 


2286 


4072 


5858 


784CIP2B 174 


5720 


501 


2287 


4073 


5859 


784CIP2B 175 


5727 


502 


2288 


4074 


5860 


784CIP2B_176


5730 


503 


2289 


4075 


5861 


784CIP2B_177 


5734 


504 


2290 


4076 


5862 


784CIP2B_178


5738 


505 


2291 


4077 


5863 


784CIP2B 179 


5739 


506 


2292 


4078 


5864 


784CIP2B 180 


5740 


507 


2293 


4079 


5865 


784CIP2B_181


5744 


508 


2294 


4080 


5866 


784CIP2B 182 


5748 


509 


2295 


4081 


5867 


784CIP2B_183


5749 


510 


2296 


4082 


5868 


784CIP2B_184 


5750 


511 


2297 


4083 


5869 


784CIP2B_185 


5750 


512 


2298 


4084 


5870 


784CIP2B_186


5750 


513 


2299 


4085 


5871 


784CIP2B_187


5761 


514 


2300 


4086 


5872 


784CIP2B_188 


5762 


515 


2301 


4087 


5873 


784CIP2B 189 


5767 


516 


2302 


4088 


5874 


784CIP2B_190


5773 


517 


2303 


4089 


5875 


784CIP2B 191 


5783 


518 


2304 


4090 


5876 


784CIP2B_192


5784 


519 


2305 


4091 


5877 


784CIP2B_193


5788 


520 


2306 


4092 


5878 


784CIP2B 194 


5798 


521 


2307 


4093 


5879 


784CIP2B_196 


5807 


522 


2308 


4094 


5880 


784CIP2B_197 


5818 


523 


2309 


4095 


5881 


784CIP2B_198


5819 


524 


2310 


4096 


5882 


784CIP2B_199 


5827 


525 


2311 


4097 


5883 


784CIP2B_200 


5828 


526 


2312 


4098 


5884 


784CIP2B 201 


5842 


527 


2313 


4099 


5885 


784CIP2B 202 


5853 


528 


2314 


4100 


5886 


784CIP2B 203 


5861 


529 


2315 


4101 


5887 


784CIP2B_204 


5864 


530 


2316 


4102 


5888 


784CIP2B 205 


5865 


531 


2317 


4103 


5889 


784CIP2B_206


5871 


532 


2318 


4104 


5890 


784CIP2B_207 


5873 


533 


2319 


4105 


5891 


784CIP2B_208 


5873 


534 


2320 


4106 


5892 


784CIP2B_209 


5875 


535 


2321 


4107 


5893 


784CIP2B 210 


5878 


536 


2322 


4108 


5894 


784CIP2B 211 


5879 


537 


2323 


4109 


5895


784CIP2B_212 


5880 


538 


2324 


4110 


5896 


784CIP2B_213 


5880 


539 


2325 


4111 


5897 


784CIP2B_214


5880 


540 


2326 


4112 


5898 


784CIP2B_215 


5880 


541 


2327 


4113 


5899 


784CIP2B_216


5885 


542 


2328 


4114 


5900 


784CIP2B_217 


5895 


543 


2329 


4115 


5901 


784CIP2B_218


5898 


544 


2330 


4116 


5902 


784CIP2B_219 


5902 


545 


2331 


4117 


5903 


784CIP2B 220 


5904 


546 


2332 


4118


5904 


784CIP2B_221 


5918 


547 


2333 


4119 


5905 


784CIP2B 222 


5921 


548 


2334 


4120 


5906 


784CIP2B 223 


5927 


549 


2335 


4121 


5907 


784CIP2B 224 


5932 


550 


2336 


4122 


5908 


784CIP2B_225 


5939 


551


2337 


4123 


5909 


784CIP2B 226 


5945


552


2338


4124 


5910 


784CIP2B 227 


594S 


553 


2339 


4125 


5911 


784CIP2B_228 


5947 


554 


2340 


4126 


5912 


784CIP2B 229 


5956 


555 


2341 


4127 


5913 


784CIP2B_230 


5967


556 


2342 


4128 


5914 


784CIP2B_232 


5975 


557 


2343 


4129 


5915 


784CIP2B_233 


5977 


558 


2344 


4130 


5916 


784CIP2B 234 


5978 


559 


2345 


4131 


5917 


784CIP2B_235


5979 


560 


2346 


4132 


5918 


784CIP2B_236


5980 


561 


2347 


4133 


5919 


784CIP2B_237 


5988


562 


2348 


4134 


5920 


784CIP2B_238


5989 


563 


2349 


4135 


5921 


784CIP2B_239 


5991 


564 


2350 


4136 


5922 


784CIP2B_240 


5997 


565 


2351 


4137 


5923 


784CIP2B_241 


5998 


566 


2352 


4138 


5924 


784CIP2B_242 


6003 


567 


2353 


4139 


5925 


784CIP2B_243 


6004 


568 


2354 


4140 


5926 


784CIP2B_244


6013 


569 


2355 


4141 


5927 


784CIP2B_245 


6028 


570 


2356 


4142 


5928 


784CIP2B_246


6028 


571 


2357 


4143 


5929 


784CIP2B_247


6029 


572 


2358 


4144 


5930 


784CIP2B_248 


6031 


573 


2359 


4145 


5931 


784CIP2B 249 


6031 


574 


2360 


4146 


5932 


784CIP2B 250 


6032 


575 


2361 


4147 


5933 


784CIP2B_251


6037 


576 


2362 


4148 


5934 


784CIP2B_252 


6037 


577 


2363 


4149 


5935 


784CIP2B_253 


6043 


578 


2364 


4150 


5936 


784CIP2B 254 


6044 


579 


2365 


4151 


5937 


784CIP2B_255 


6046 


580 


2366 


4152 


5938 


784CIP2B_256 


6046 


581 


2367 


4153 


5939 


784CIP2B_257


6049 


582 


2368 


4154 


5940 


784CIP2B 258 


6051 


583 


2369 


4155 


5941 


784CIP2B_259 


6053 


584 


2370 


4156 


5942 


784CIP2B_260 


6060 


585 


2371 


4157 


5943 


784CIP2B 261 


6063 


586 


2372 


4158


5944 


784CIP2B_262


6066 


587 


2373 


4159 


5945 


784CIP2B_263 


6067 


588 


2374 


4160 


5946 


784CIP2B 264 


6068 


589 


2375 


4161 


5947 


784CIP2B_265 


6073 


590 


2376 


4162 


5948 


784CIP2B 266 


6076 


591 


2377 


4163


5949 


784CIP2B_267


6076 


592 


2378 


4164 


5950 


784CIP2B 268 


6077 


593 


2379 


4165 


5951 


784CIP2B 269 


6079 


594 


2380 


4166 


5952 


784CIP2B_270 


6082 


595 


2381 


4167 


5953 


784CIP2B_272 


6088 


596


2382 


4168 


5954 


784CIP2B_273 


6091 


597 


2383 


4169 


5955 


784CIP2B_274 


6094 


598 


2384 


4170 


5956 


784CIP2B_275 


6101 


599 


2385 


4171


5957 


784CIP2B_276 


6103 


600 


2386 


4172 


5958 


784CIP2B_277 


6104 


601 


2387 


4173 


5959 


784CIP2B_278 


6108 


602 


2388 


4174 


5960 


784CIP2B_279 


6112 


603 


2389 


4175 


5961 


784CIP2B 280 


6121 


604 


2390 


4176 


5962 


784CIP2B_281 


6125 


605 


2391 


4177 


5963 


784CIP2B_282 


6126 


606 


2392 


4178 


5964 


784CIP2B 283 


6128 


607 


2393 


4179 


5965 


784CIP2B_284


6129 


608 


2394 


4180 


5966 


784CIP2B_285


6133 


609 


2395 


4181 


5967 


784CIP2B 286 


6133 


610 


2396 


4182 


5968 


784CIP2B_287 


6135 


611 


2397 


4183 


5969 


784CIP2B_288 


6139


612 


2398 


4184 


5970 


784CIP2B_289 


6141 


613 


2399 


4185 


5971 


784CIP2B_290 


6145 


614 


2400 


4186 


5972 


784CIP2B 291 


6146 


615 


2401 


4187 


5973 


784CIP2B_292


6148 


616 


2402


4188 


5974 


784CIP2B 293 


6149 


617 


2403


4189 


5975 


784CIP2B_294 


6149 






WO 01/53312 



PCT/US00/34263 



SEQ ID NO: of full-length nucleotide sequence


SEQ ID NO: of full-length peptide sequence


SEQ ID NO: of contig nucleotide sequence


SEQ ID NO: of contig peptide sequence


Priority docket number_corresponding SEQ ID NO: in priority application


SEQ ID NO: in U.S. S.N. 09/488,725




618 


2404 


4190 


5976 


784CIP2B_295 


6153 


619 


2405 


4191 


5977 


784CIP2B_296 


6159 


620 


2406 


4192 


5978 


784CIP2B_297 


6164 


621 


2407 


4193 


5979 


784CIP2B_298 


6167 


622 


2408 


4194 


5980 


784CIP2B_299 


6172 


623 


2409 


4195 


5981 


784CIP2B_300 


6173 


624 


2410 


4196 


5982 


784CIP2B_301


6190 


625 


2411 


4197 


5983 


784CIP2B 302 


6194 


626 


2412 


4198 


5984 


784CIP2B_303


6196 


627 


2413 


4199 


5985 


784CIP2B_304 


6197 


628 


2414 


4200 


5986 


784CIP2B 305 


6198 


629 


2415 


4201 


5987 


784CIP2B_306 


6198 


630 


2416 


4202 


5988 


784CIP2B_308 


6214 


631 


2417 


4203 


5989 


784CIP2B_309 


6215 


632 


2418 


4204 


5990 


784CIP2B_310 


6219 


633


2419 


4205 


5991 


784CIP2B_311


6226 


634


2420 


4206 


5992 


784CIP2B_312 


6229 


635 


2421 


4207 


5993 


784CIP2B_313 


6234 


636 


2422 


4208 


5994 


784CIP2B_314 


6237 


637 


2423 


4209 


5995 


784CIP2B_315 


6238 


638 


2424 


4210 


5996 


784CIP2B_316 


6239 


639 


2425 


4211 


5997 


784CIP2B 317 


6239 


640 


2426 


4212 


5998 


784CIP2B_318 


6239 


641 


2427 


4213 


5999 


784CIP2B_319 


6240 


642 


2428 


4214 


6000 


784CIP2B 320 


6244 


643 


2429 


4215 


6001 


784CIP2B_321 


6245 


644 


2430


4216 


6002 


784CIP2B_322 


6250 


645 


2431 


4217 


6003 


784CIP2B 323 


6252 


646 


2432 


4218 


6004 


784CIP2B 324 


6252 


647 


2433 


4219 


6005 


784CIP2B_325 


6256 


648 


2434 


4220 


6006 


784CIP2B_326 


6260 


649 


2435 


4221 


6007 


784CIP2B 327 


6261 


650 


2436 


4222 


6008


784CIP2B_328 


6264


651 


2437 


4223 


6009 


784CIP2B_329 


6265 


652 


2438


4224 


6010 


784CIP2B_330 


6266


653 


2439 


4225 


6011 


784CIP2B_331 


6270 


654 


2440 


4226 


6012 


784CIP2B_332 


6271 


655 


2441 


4227 


6013 


784CIP2B_334


6274 


656 


2442 


4228 


6014 


784CIP2B_335 


6276 


657 


2443 


4229 


6015 


784CIP2B_336 


6281 


658 


2444 


4230 


6016 


784CIP2B_337 


6281 


659 


2445 


4231 


6017 


784CIP2B_338 


6288 


660 


2446 


4232 


6018 


784CIP2B_339 


6292 


661 


2447 


4233 


6019 


784CIP2B_340


6294 


662 


2448 


4234 


6020 


784CIP2B 343 


6312 


663 


2449 


4235 


6021


784CIP2B 344 


6312 


664 


2450 


4236 


6022 


784CIP2B_345 


6312 


665 


2451 


4237 


6023 


784CIP2B_346 


6322 


666 


2452 


4238 


6024 


784CIP2B_347 


6324 


667 


2453 


4239 


6025 


784CIP2B 349 


6329 


668 


2454 


4240 


6026 


784CIP2B 350 


6331 


669 


2455 


4241 


6027 


784CIP2B_351 


6333 


670 


2456 


4242 


6028 


784CIP2B 352 


6334 


671 


2457 


4243 


6029 


784CIP2B 353 


6337 


672 


2458 


4244 


6030 


784CIP2B 354 


6339 


673 


2459 


4245 


6031 


784CIP2B 355 


6346 


674 


2460 


4246 


6032 


784CIP2B 356 


6348 


675 


2461 


4247 


6033 


784CIP2B 357 


6348 


676 


2462 


4248 


6034 


784CIP2B 358 


6350 


677 


2463 


4249 


6035 


784CIP2B_359 


6351 


678 


2464 


4250 


6036 


784CIP2B 360 


6355 


679 


2465 


4251 


6037


784CIP2B 361 


6362


680 


2466 


4252 


6038 


784CIP2B_362 


6368 


681 


2467 


4253 


6039 


784CIP2B_363 


6369 


682 


2468 


4254 


6040 


784CIP2B_364


6371 


683


2469 


4255 


6041 


784CIP2B 365 


6376 


684 


2470 


4256 


6042 


784CIP2B 366 


6379


685


2471 


4257 


6043 


784CIP2B_367 


6380 


686 


2472 


4258 


6044 


784CIP2B_368 


6381 


687 


2473 


4259 


6045 


784CIP2B 369 


6392 


688 


2474 


4260 


6046 


784CIP2B_370


6395 


689 


2475 


4261 


6047 


784CIP2B 371 


6397 


690 


2476 


4262 


6048 


784CIP2B 372 


6400 


691 


2477 


4263 


6049 


784CIP2B_373 


6401 


692


2478 


4264 


6050 


784CIP2B_374 


6411 


693 


2479 


4265 


6051 


784CIP2B 375 


6411 


694 


2480 


4266 


6052 


784CIP2B_376 


6411 


695 


2481 


4267 


6053 


784CIP2B_377 


6416 


696 


2482 


4268 


6054 


784CIP2B 378 


6418 


697 


2483 


4269 


6055 


784CIP2B 379 


6422 


698 


2484


4270 


6056


784CIP2B_380 


6423 


699 


2485 


4271 


6057 


784CIP2B 381 


6426 


700 


2486 


4272 


6058 


784CIP2B_382 


6427 


701 


2487 


4273 


6059 


784CIP2B_383 


6428 


702 


2488 


4274


6060 


784CIP2B_384 


6429 


703 


2489 


4275 


6061 


784CIP2B_385 


6430 


704 


2490 


4276 


6062 


784CIP2B_386 


6432 


705 


2491 


4277 


6063 


784CIP2B_387 


6432 


706 


2492 


4278 


6064 


784CIP2B_388 


6438 


707 


2493 


4279 


6065 


784CIP2B_389 


6441 


708 


2494 


4280 


6066 


784CIP2B_390 


6446 


709 


2495 


4281 


6067 


784CIP2B_391 


6454 


710 


2496 


4282 


6068 


784CIP2B_392 


6459 


711 


2497 


4283 


6069 


784CIP2B 394 


6461 


712 


2498 


4284 


6070 


784CIP2B_395


6467


713 


2499 


4285 


6071 


784CIP2B_396


6468 


714 


2500 


4286 


6072 


784CIP2B_397 


6487 


715 


2501 


4287 


6073 


784CIP2B_398 


6491 


716 


2502 


4288 


6074 


784CIP2B_399 


6506 


717 


2503 


4289 


6075 


784CIP2B_401 


6514 


718 


2504 


4290 


6076 


784CIP2B_402 


6519 


719 


2505 


4291 


6077 


784CIP2B_403 


6521 


720 


2506 


4292 


6078 


784CIP2B_404 


6532 


721 


2507 


4293 


6079 


784CIP2B 405 


6536 


722 


2508 


4294 


6080 


784CIP2B_406 


6543 


723 


2509


4295 


6081 


784CIP2B_407 


6544 


724 


2510


4296 


6082 


784CIP2B_408 


6548


725 


2511 


4297 


6083 


784CIP2B_409 


6551


726 


2512 


4298 


6084 


784CIP2B 410 


6551 


727 


2513 


4299 


6085 


784CIP2B_411 


6552 


728 


2514 


4300 


6086 


784CIP2B_412 


6554 


729 


2515 


4301 


6087 


784CIP2B_413 


6556 


730


2516 


4302 


6088 


784CIP2B_414 


6560 


731 


2517 


4303 


6089 


784CIP2B_415 


6563 


732 


2518 


4304 


6090 


784CIP2B_416 


6564 


733 


2519 


4305 


6091 


784CIP2B_417 


6567


734 


2520 


4306


6092 


784CIP2B 418 


6573 


735 


2521 


4307 


6093 


784CIP2B_419 


6575 


736 


2522 


4308 


6094 


784CIP2B 420 


6577 


737 


2523 


4309 


6095 


784CIP2B 421 


6593 


738 


2524


4310 


6096 


784CIP2B 422 


6595 


739 


2525 


4311 


6097 


784CIP2B_423 


6599


740 


2526 


4312 


6098 


784CIP2B_424


6625 


741 


2527 


4313 


6099 


784CIP2B_425 


6625


742


2528 


4314 


6100 


784CIP2B_426 


6626 


743 


2529 


4315 


6101 


784CIP2B_427 


6630 


744


2530 


4316 


6102 


784CIP2B_428 


6631 


745 


2531 


4317 


6103 


784CIP2B_429 


6632 


746 


2532 


4318 


6104 


784CIP2B 430 


6633 


747


2533 


4319 


6105 


784CIP2B 431 


6634 


748


2534 


4320 


6106 


784CIP2B 432 


6638 


749


2535 


4321 


6107 


784CIP2B 433 


6641 


750 


2536 


4322 


6108 


784CIP2B_434 


6644 


751 


2537 


4323 


6109 


784CIP2B 435 


6646 


752


2538 


4324 


6110 


784CIP2B_436


6648 


753


2539


4325


6111 


784CIP2B_437 


6652 


754


2540


4326 


6112 


784CIP2B 438 


6654 


755


2541 


4327 


6113 


784CIP2B_439 


6657 


756


2542 


4328 


6114 


784CIP2B_440 


6658 


757


2543 


4329 


6115 


784CIP2B_441 


6663 


758


2544 


4330 


6116 


784CIP2B_442 


6664


759


2545


4331 


6117 


784CIP2B 443 


6668 


760 


2546 


4332 


6118 


784CIP2B_444 


6669 


761 


2547 


4333 


6119 


784CIP2B 445 


6673 


762 


2548 


4334 


6120 


784CIP2B_446 


6685 


763 


2549 


4335 


6121 


784CIP2B 447 


6687 


764 


2550 


4336 


6122 


784CIP2B_448 


6689 


765 


2551 


4337 


6123 


784CIP2B_449 


6693 


766 


2552 


4338 


6124 


784CIP2B 450 


6698 


767 


2553 


4339 


6125 


784CIP2B 451 


6699 


768 


2554 


4340 


6126 


784CIP2B_452 


6705 


769 


2555 


4341 


6127 


784CIP2B_453 


6711 


770 


2556 


4342 


6128 


784CIP2B_454 


6713


771 


2557 


4343 


6129 


784CIP2B_455 


6716 


772 


2558 


4344 


6130 


784CIP2B 456 


6725 


773 


2559 


4345 


6131 


784CIP2B_457 


6726 


774 


2560 


4346 


6132 


784CIP2B_458


6727 


775 


2561 


4347 


6133 


784CIP2B 459 


6730 


776


2562 


4348 


6134 


784CIP2B 460 


6730 


777


2563 


4349 


6135 


784CIP2B_461


6730 


778 


2564 


4350 


6136 


784CIP2B_462 


6732 


779 


2565 


4351 


6137 


784CIP2B 463 


6733 


780 


2566 


4352 


6138 


784CIP2B 464 


6737


781


2567 


4353 


6139 


784CIP2B_465 


6745 


782


2568


4354 


6140 


784CIP2B 466 


6751 


783


2569 


4355 


6141 


784CIP2B_467 


6754 


784 


2570 


4356 


6142


784CIP2B_468


6758 


785


2571 


4357 


6143 


784CIP2B 469 


6761 


786


2572 


4358 


6144 


784CIP2B_470 


6765 


787


2573 


4359 


6145 


784CIP2B_471 


6768 


788


2574 


4360 


6146 


784CIP2B 472 


6773 


789


2575 


4361 


6147 


784CIP2B_473


6776 


790


2576 


4362 


6148 


784CIP2B_474 


6796 


791 


2577 


4363 


6149 


784CIP2B_475 


6798


792 


2578


4364 


6150 


784CIP2B 476 


6823 


793 


2579 


4365 


6151 


784CIP2B 477 


6825 


794 


2580 


4366


6152 


784CIP2B_478 


6826


795


2581


4367


6153 


784CIP2B 479 


6839 


796 


2582 


4368 


6154 


784CIP2B 480 


6844


797 


2583 


4369


6155 


784CIP2B_482


6849 


798 


2584 


4370 


6156 


784CIP2B_483


6854 


799 


2585 


4371 


6157 


784CIP2B_484 


6857 


800


2586 


4372 


6158 


784CIP2B_485


6861 


801 


2587 


4373


6159 


784CIP2B 486 


6873 


802 


2588 


4374 


6160 


784CIP2B 487 


6875 


803 


2589 


4375


6161 


784CIP2B 488 


6877


804 


2590 


4376


6162


784CIP2B_489


6880


805 


2591


4377 


6163


784CIP2B_490


6885 


806 


2592 


4378 


6164


784CIP2B_491


6890 


807 


2593 


4379 


6165 


784CIP2B_492


6890


808 


2594 


4380 


6166


784CIP2B_493


6894 


809 


2595 


4381


6167 


784CIP2B_494


6901 


810 


2596 


4382 


6168


784CIP2B_495


6904 


811 


2597 


4383


6169


784CIP2B_496


6907 


812 


2598 


4384 


6170


784CIP2B_497


6914


813 


2599 


4385 


6171 


784CIP2B_498


6917 


814


2600 


4386 


6172


784CIP2B_499


6923 


815 


2601 


4387 


6173 


784CIP2B_500


6929 


816 


2602 


4388


6174


784CIP2B_501


6931 


817 


2603 


4389 


6175


784CIP2B 502 


6935 


818 


2604


4390


6176


784CIP2B_503


6940 


819 


2605 


4391


6177 


784CIP2B 504 


6945 


820 


2606


4392


6178


784CIP2B 505 


6946 


821 


2607 


4393


6179


784CIP2B_506


6947 


822 


2608


4394


6180


784CIP2B 507 


6949 


823 


2609


4395


6181


784CIP2B 508 


6959 


824 


2610 


4396


6182


784CIP2B_509


6960 


825 


2611 


4397


6183


784CIP2B 510 


6962 


826 


2612 


4398


6184


784CIP2B_511


6963


827 


2613


4399


6185


784CIP2B_512


6967 


828 


2614 


4400 


6186


784CIP2B 513 


6983 


829 


2615


4401


6187


784CIP2B 514 


6988 


830 


2616 


4402


6188


784CIP2B_515


6996 


831 


2617


4403


6189


784CIP2B_516 


7003 


832 


2618 


4404 


6190 


784CIP2B 517 


7016 


833 


2619 


4405 


6191 


784CIP2B_518


7017 


834 


2620


4406 


6192 


784CIP2B 519 


7025 


835 


2621 


4407 


6193 


784CIP2B 520 


7025 


836 


2622 


4408 


6194 


784CIP2B_521 


7025 


837 


2623


4409 


6195 


784CIP2B 522 


7050 


838


2624 


4410 


6196 


784CIP2B 523 


7051 


839 


2625 


4411


6197 


784CIP2B 524 


7055 


840 


2626 


4412


6198


784CIP2B 525 


7060 


841 


2627 


4413 


6199 


784CIP2B 526 


7064 


842 


2628


4414 


6200 


784CIP2B 527 


7067 


843 


2629 


4415 


6201 


784CIP2B 528 


7071 


844 


2630


4416 


6202 


784CIP2B_529


7072 


845 


2631 


4417 


6203 


784CIP2B_530 


7073 


846 


2632 


4418 


6204 


784CIP2B_531


7076 


847 


2633 


4419 


6205 


784CIP2B_532


7088 


848 


2634


4420 


6206 


784CIP2B 533 


7089 


849 


2635 


4421 


6207


784CIP2B 534 


7091 


850 


2636


4422 


6208 


784CIP2B_535


7091 


851 


2637 


4423 


6209 


784CIP2B_536


7104 


852 


2638 


4424 


6210 


784CIP2B 537 


7105 


853 


2639 


4425 


6211


784CIP2B 538 


7105 


854 


2640 


4426 


6212 


784CIP2B_539


7109 


855 


2641 


4427


6213 


784CIP2B 540 


7109 


856


2642


4428


6214 


784CIP2B_541 


7119 


857 


2643 


4429 


6215


784CIP2B_542


7120


858 


2644 


4430 


6216 


784CIP2B_543 


7121 


859 


2645 


4431


6217 


784CIP2B 544 


7126 


860


2646


4432 


6218 


784CIP2B_545 


7127 


861


2647 


4433 


6219 


784CIP2B 546 


7130 


862 


2648 


4434 


6220 


784CIP2B 547 


7131 


863 


2649 


4435 


6221 


784CIP2B 548 


7144 


864 


2650 


4436 


6222 


784CIP2B 549 


7159 


865


2651 


4437


6223 


784CIP2B 550 


7163


866 


2652 


4438 


6224 


784CIP2B_551 


7175 


867 


2653 


4439 


6225 


784CIP2B_552 


7188 


868 


2654 


4440 


6226 


784CIP2B 553 


7189 


869 


2655 


4441 


6227 


784CIP2B_554 


7190 


870 


2656 


4442 


6228 


784CIP2B_555 


7191 


871


2657 


4443 


6229 


784CIP2B_556 


7203 


872 


2658


4444 


6230 


784CIP2B_557 


7204 


873 


2659 


4445 


6231 


784CIP2B_558 


7208 


874 


2660 


4446 


6232 


784CIP2B_559 


7209 


875 


2661 


4447 


6233 


784CIP2B 560 


7210 


876 


2662 


4448 


6234 


784CIP2B_561


7216 


877 


2663 


4449 


6235 


784CIP2B 562 


7221 


878 


2664 


4450 


6236 


784CIP2B 563 


7230 


879 


2665 


4451 


6237 


784CIP2B_564 


7237 


880 


2666 


4452 


6238 


784CIP2B_565 


7240 


881 


2667 


4453 


6239 


784CIP2B_566


7245 


882 


2668


4454 


6240 


784CIP2B_567 


7250 


883 


2669 


4455 


6241 


784CIP2B_568 


7251 


884 


2670 


4456 


6242 


784CIP2B 569 


7255 


885 


2671 


4457 


6243 


784CIP2B_570 


7260 


886 


2672 


4458 


6244 


784CIP2B 571 


7265 


887 


2673 


4459 


6245 


784CIP2B_572


7268 


888 


2674 


4460 


6246 


784CIP2B_573 


7275 


889 


2675 


4461 


6247 


784CIP2B 574 


7279 


890 


2676 


4462 


6248 


784CIP2B_575 


7283 


891 


2677 


4463 


6249 


784CIP2B_576 


7283 


892 


2678 


4464 


6250 


784CIP2B_577


7287 


893 


2679 


4465 


6251 


784CIP2B_578


7301 


894 


2680 


4466 


6252 


784CIP2B_579 


7308 


895 


2681 


4467 


6253 


784CIP2B_580 


7308 


896 


2682 


4468 


6254 


784CIP2B 581 


7309 


897 


2683 


4469 


6255 


784CIP2B_582


7319 


898 


2684 


4470 


6256 


784CIP2B_583


7320 


899 


2685 


4471 


6257 


784CIP2B_584 


7326 


900 


2686 


4472 


6258 


784CIP2B 585 


7326 


901 


2687 


4473 


6259 


784CIP2B_586


7334 


902 


2688 


4474 


6260 


784CIP2B 587 


7337 


903 


2689 


4475 


6261 


784CIP2B_588


7339 


904 


2690 


4476 


6262 


784CIP2B 589 


7344 


905 


2691 


4477 


6263


784CIP2B_590


7355 


906 


2692


4478 


6264 


784CIP2B_591 


7363 


907 


2693 


4479 


6265 


784CIP2B_592


7363 


908


2694


4480 


6266 


784CIP2B_593 


7365 


909 


2695 


4481 


6267 


784CIP2B_594 


7368 


910 


2696 


4482 


6268 


784CIP2B 595 


7369 


911 


2697 


4483 


6269 


784CIP2B_596 


7372 


912 


2698 


4484 


6270 


784CIP2B_599


7375 


913 


2699 


4485 


6271 


784CIP2B 600 


7381 


914 


2700 


4486 


6272 


784CIP2B_601 


7383 


915 


2701 


4487 


6273 


784CIP2B_602 


7387 


916 


2702 


4488 


6274 


784CIP2B_603 


7391 


917 


2703 


4489 


6275 


784CIP2B_604 


7393 


918 


2704 


4490 


6276 


784CIP2B_605 


7395 


919 


2705 


4491 


6277 


784CIP2B_606 


7397 


920 


2706 


4492 


6278 


784CIP2B_607 


7399 


921 


2707 


4493 


6279 


784CIP2B_608 


7405 


922 


2708 


4494 


6280 


784CIP2B_609 


7406 


923 


2709 


4495 


6281 


784CIP2B_610 


7406 


924 


2710 


4496 


6282 


784CIP2B 611 


7409 


925 


2711 


4497 


6283 


784CIP2B 612 


7410 


926 


2712 


4498 


6284 


784CIP2B_613 


7411 


927 


2713 


4499 


6285 


784CIP2B 614 


7417


928 


2714 


4500 


6286 


784CIP2B_615 


7418 


929 


2715 


4501 


6287 


784CIP2B_616


7421 


930 


2716 


4502 


6288 


784CIP2B_617 


7422 


931 


2717 


4503 


6289 


784CIP2B_618


7422 


932 


2718 


4504 


6290 


784CIP2B_619 


7423 


933 


2719 


4505 


6291 


784CIP2B_620 


7424 


934 


2720 


4506 


6292


784CIP2B_621 


7426 


935 


2721 


4507 


6293 


784CIP2B 622 


7427 


936 


2722 


4508 


6294 


784CIP2B_623 


7428 


937 


2723 


4509 


6295 


784CIP2B 624 


7430 


938 


2724 


4510 


6296 


784CIP2B_625


7435 


939 


2725 


4511 


6297 


784CIP2B 626 


7437 


940 


2726 


4512 


6298 


784CIP2B_627 


7439 


941 


2727 


4513 


6299 


784CIP2B_628 


7440 


942 


2728 


4514 


6300 


784CIP2B_629 


7442 


943 


2729 


4515 


6301 


784CIP2B_630


7450 


944 


2730 


4516 


6302


784CIP2B_631 


7451 


945 


2731 


4517 


6303 


784CIP2B_632 


7452 


946 


2732 


4518 


6304 


784CIP2B_633 


7454 


947 


2733 


4519 


6305 


784CIP2B_634 


7457 


948


2734 


4520 


6306 


784CIP2B_635 


7459 


949 


2735 


4521 


6307 


784CIP2B_636 


7461 


950 


2736 


4522 


6308 


784CIP2B_637 


7463 


951 


2737 


4523 


6309 


784CIP2B_638 


7466 


952 


2738 


4524 


6310 


784CIP2B_639 


7469 


953 


2739 


4525 


6311 


784CIP2B_640 


7473 


954 


2740 


4526 


6312 


784CIP2B_641 


7481 


955 


2741 


4527 


6313 


784CIP2B_642 


7482 


956 


2742 


4528 


6314 


784CIP2B_643 


7482 


957 


2743 


4529 


6315 


784CIP2B_644 


7483 


958 


2744 


4530 


6316 


784CIP2B_645 


7485 


959 


2745 


4531 


6317 


784CIP2B_646 


7486 


960 


2746 


4532 


6318 


784CIP2B_647 


7487 


961 


2747 


4533 


6319 


784CIP2B 648 


7491 


962 


2748 


4534 


6320 


784CIP2B_649 


7492 


963 


2749 


4535 


6321 


784CIP2B 650 


7494 


964 


2750 


4536 


6322 


784CIP2B_651 


7498


965 


2751 


4537 


6323 


784CIP2B_652 


7504 


966 


2752 


4538 


6324 


784CIP2B_653 


7508 


967 


2753 


4539 


6325 


784CIP2B 654 


7516 


968 


2754 


4540


6326


784CIP2B_655 


7518


969 


2755 


4541 


6327 


784CIP2B 656 


7519 


970 


2756 


4542 


6328 


784CIP2B_657 


7521 


971 


2757 


4543 


6329 


784CIP2B_658 


7529 


972 


2758 


4544 


6330 


784CIP2B_659 


7532 


973 


2759 


4545 


6331 


784CIP2B_660 


7533 


974 


2760 


4546 


6332 


784CIP2B_661 


7535 


975 


2761 


4547 


6333 


784CIP2B_662 


7545 


976 


2762 


4548 


6334 


784CIP2B 663 


7546 


977 


2763 


4549 


6335 


784CIP2B_664 


7552 


978 


2764 


4550 


6336 


784CIP2B 665 


7554 


979 


2765 


4551 


6337 


784CIP2B_666 


7567 


980 


2766


4552 


6338 


784CIP2B_667 


7569 


981 


2767 


4553 


6339 


784CIP2B_668 


7575 


982 


2768 


4554 


6340 


784CIP2B_669 


7576 


983 


2769 


4555 


6341 


784CIP2B 670 


7577 


984 


2770 


4556 


6342 


784CIP2B_671 


7579


985 


2771 


4557 


6343 


784CIP2B_672 


7582 


986 


2772 


4558 


6344 


784CIP2B_673 


7587 


987 


2773 


4559 


6345 


784CIP2B 674 


7589 


988 


2774 


4560 


6346


784CIP2B_675


7597 


989 


2775 


4561 


6347 


784CIP2B_676 


7597


990 


2776


4562


6348


784CIP2B_677 


7609


991


2777


4563 


6349 


784CIP2B 678 


7609 


992


2778


4564


6350


784CIP2B 679 


7609 


993


2779


4565


6351 


784CIP2B 680 


7613 


994


2780


4566


6352 


784CIP2B_681


7623


995


2781


4567 


6353 


784CIP2B_682


7629


996


2782


4568 


6354 


784CIP2B_683


7630 


997 


2783


4569


6355 


784CIP2B_684


7633 


998


2784


4570 


6356 


784CIP2B_685


7635 


999 


2785


4571 


6357 


784CIP2B_686 


7638 


1000 


2786 


4572 


6358 


784CIP2B_687


7639 


1001


2787 


4573 


6359 


784CIP2B 688 


7646 


1002 


2788 


4574 


6360 


784CIP2B_689 


7647 


1003 


2789 


4575 


6361 


784CIP2B 690 


7648 


1004 


2790 


4576 


6362 


784CIP2B 691 


7658


1005 


2791 


4577 


6363 


784CIP2B 692 


7664 


1006 


2792 


4578 


6364 


784CIP2B 693 


7664 


1007 


2793 


4579 


6365 


784CIP2B 695 


7674


1008


2794 


4580


6366 


784CIP2B_696


7675 


1009 


2795 


4581 


6367 


784CIP2B 697 


7676 


1010


2796 


4582 


6368 


784CIP2B 698 


7681 


1011


2797 


4583 


6369 


784CIP2B_699


7688 


1012


2798


4584 


6370 


784CIP2B_700 


7693 


1013


2799 


4585


6371 


784CIP2B_701 


7694 


1014


2800


4586 


6372 


784CIP2B 702 


7715 


1015


2801 


4587


6373 


784CIP2B_703


7716 


1016 


2802 


4588 


6374 


784CIP2B_704 


7718 


1017 


2803 


4589


6375 


784CIP2B_705


7721 


1018 


2804 


4590 


6376 


784CIP2B 706 


7723 


1019 


2805 


4591 


6377 


784CIP2B_707


7729 


1020 


2806


4592 


6378 


784CIP2B_708 


7733 


1021 


2807 


4593 


6379 


784CIP2B_709 


7735


1022


2808


4594 


6380 


784CIP2B 710 


7741 


1023


2809 


4595 


6381 


784CIP2B_711 


7743


1024


2810


4596 


6382


784CIP2B_712 


7748


1025


2811


4597 


6383 


784CIP2B_713 


7749 


1026 


2812 


4598 


6384 


784CIP2B 714 


7750 


1027 


2813 


4599 


6385 


784CIP2B_715


7757 


1028


2814 


4600 


6386 


784CIP2B_716 


7759 


1029 


2815 


4601 


6387 


784CIP2B_717


7760 


1030 


2816 


4602 


6388 


784CIP2B_718


7760 


1031 


2817 


4603 


6389 


784CIP2B_719 


7764 


1032


2818 


4604 


6390 


784CIP2B_720


7765 


1033 


2819 


4605 


6391 


784CIP2B_721 


7766 


1034


2820 


4606 


6392 


784CIP2B_722 


7767 


1035


2821 


4607 


6393 


784CIP2B 723 


7769 


1036


2822


4608 


6394 


784CIP2B 724 


7770 


1037


2823


4609 


6395 


784CIP2B_725 


7774 


1038


2824 


4610 


6396 


784CIP2B 726 


7779 


1039


2825


4611 


6397 


784CIP2B_727 


7781 


1040 


2826 


4612 


6398 


784CIP2B_728 


7782 


1041 


2827 


4613 


6399 


784CIP2B 729 


7783


1042 


2828 


4614 


6400 


784CIP2B 730 


7787 


1043 


2829


4615 


6401 


784CIP2B 731 


7792 


1044 


2830 


4616 


6402 


784CIP2B_732 


7795 


1045 


2831 


4617 


6403 


784CIP2B_733 


7801 


1046 


2832 


4618 


6404 


784CIP2B_734 


7807 


1047 


2833 


4619 


6405 


784CIP2B 735 


7808 


1048 


2834 


4620 


6406 


784CIP2B_736 


7819 


1049 


2835 


4621 


6407 


784CIP2B 737 


7824 


1050 


2836 


4622 


6408 


784CIP2B_738


7826 


1051


2837


4623 


6409 


784CIP2B 739 


7829


1052 


2838 


4624 


6410 


784CIP2B_740 


7832 


1053 


2839 


4625 


6411 


784CIP2B 741 


7839 


1054 


2840 


4626 


6412 


784CIP2B_743 


7847 


1055 


2841 


4627 


6413 


784CIP2B_744 


7848 


1056 


2842 


4628 


6414 


784CIP2B_745 


7853 


1057 


2843 


4629 


6415 


784CIP2B 746 


7854 


1058 


2844 


4630 


6416 


784CIP2B_747 


7856 


1059 


2845 


4631 


6417 


784CIP2B 748 


7862 


1060 


2846 


4632 


6418 


784CIP2B 749 


7865 


1061 


2847 


4633 


6419 


784CIP2B_750 


7874 


1062 


2848 


4634 


6420 


784CIP2B 751 


7877 


1063 


2849 


4635 


6421 


784CIP2B 752 


7880 


1064 


2850 


4636 


6422 


784CIP2B_753 


7882 


1065 


2851 


4637 


6423 


784CIP2B_754 


7884 


1066 


2852 


4638 


6424 


784CIP2B_755 


7886 


1067 


2853 


4639 


6425 


784CIP2B_756


7888 


1068 


2854 


4640 


6426 


784CIP2B_757


7889 


1069 


2855 


4641 


6427 


784CIP2B_758


7901


1070 


2856 


4642 


6428 


784CIP2B 759 


7910 


1071 


2857 


4643 


6429 


784CIP2B_760 


7911 


1072 


2858 


4644 


6430 


784CIP2B_761 


7921 


1073 


2859 


4645 


6431 


784CIP2B_762


7923 


1074 


2860 


4646 


6432 


784CIP2B 763 


7924 


1075 


2861 


4647 


6433 


784CIP2B_764 


7925 


1076 


2862 


4648 


6434 


784CIP2B_765 


7928 


1077 


2863 


4649 


6435 


784CIP2B_766 


7929 


1078


2864 


4650 


6436 


784CIP2B_767


7930 


1079 


2865 


4651 


6437 


784CIP2B 768 


7934 


1080 


2866 


4652 


6438 


784CIP2B_769


7938 


1081 


2867 


4653 


6439 


784CIP2B_770 


7942 


1082 


2868 


4654 


6440 


784CIP2B 771 


7945 


1083 


2869 


4655 


6441 


784CIP2B_772


7946 


1084


2870 


4656 


6442 


784CIP2B_773 


7948 


1085 


2871 


4657 


6443 


784CIP2B 774 


7951 


1086 


2872


4658 


6444 


784CIP2B_775


7952 


1087 


2873 


4659 


6445 


784CIP2B_776


7953 


1088 


2874 


4660 


6446 


784CIP2B 777 


7954 


1089 


2875 


4661 


6447 


784CIP2B_778 


7957 


1090 


2876 


4662 


6448 


784CIP2B 779 


7958 


1091 


2877 


4663 


6449 


784CIP2B 780 


7961 


1092 


2878


4664 


6450 


784CIP2B_781


7965 


1093 


2879 


4665 


6451 


784CIP2B_782


7966 


1094 


2880 


4666 


6452 


784CIP2B_783


7979 


1095 


2881 


4667 


6453


784CIP2B_784


7986 


1096 


2882 


4668 


6454 


784CIP2B_785 


7986 


1097 


2883 


4669 


6455 


784CIP2B_786 


7988 


1098 


2884 


4670 


6456 


784CIP2B_787 


7991 


1099 


2885 


4671 


6457 


784CIP2B 788 


7992 


1100 


2886 


4672 


6458 


784CIP2B_789 


7992 


1101 


2887 


4673 


6459 


784CIP2B_790 


7992 


1102 


2888 


4674 


6460 


784CIP2B_791


7992 


1103 


2889 


4675 


6461 


784CIP2B 792 


8003 


1104 


2890 


4676 


6462 


784CIP2B 793 


8014 


1105 


2891


4677 


6463 


784CIP2B_794 


8015


1106


2892 


4678 


6464 


784CIP2B_795


8016


1107 


2893 


4679


6465 


784CIP2B 796 


8017 


1108 


2894


4680 


6466 


784CIP2B 797 


8019 


1109 


2895 


4681 


6467 


784CIP2B 798 


8020 


1110 


2896 


4682 


6468 


784CIP2B 799 


8022 


1111 


2897 


4683 


6469


784CIP2B 800 


8022 


1112 


2898 


4684 


6470 


784CIP2B_801 


8028 


1113 


2899


4685 


6471 


784CIP2B_802


8030


1114 


2900 


4686 


6472 


784CIP2B_803 


8038 


1115 


2901 


4687 


6473 


784CIP2B 804 


8042 


1116 


2902 


4688 


6474 


784CIP2B 805 


8045 


1117 


2903 


4689 


6475 


784CIP2B_806


8045 


1118 


2904 


4690 


6476 


784CIP2B_807 


8046 


1119 


2905 


4691 


6477 


784CIP2B_808 


8047 


1120 


2906 


4692 


6478 


784CIP2B 809 


8051 


1121 


2907 


4693 


6479 


784CIP2B 810 


8059 


1122 


2908 


4694 


6480 


784CIP2B 811 


8064 


1123 


2909 


4695 


6481 


784CIP2B 812 


8069 


1124 


2910 


4696 


6482 


784CIP2B 813 


8074 


1125 


2911 


4697 


6483 


784CIP2B 814 


8077 


1126 


2912 


4698 


6484 


784CIP2B 815 


8078 


1127 


2913 


4699 


6485 


784CIP2B_816 


8079 


1128 


2914 


4700 


6486 


784CIP2B_817


8084 


1129 


2915 


4701 


6487 


784CIP2B_818


8088 


1130 


2916 


4702 


6488 


784CIP2B_819


8090 


1131 


2917 


4703 


6489 


784CIP2B_820 


8091 


1132 


2918 


4704 


6490 


784CIP2B_821


8099 


1133 


2919 


4705 


6491 


784CIP2B_822 


8099 


1134 


2920 


4706 


6492 


784CIP2B 823 


8100 


1135 


2921 


4707 


6493 


784CIP2B 824 


8102


1136 


2922 


4708 


6494 


784CIP2B_825 


8103 


1137 


2923 


4709 


6495 


784CIP2B_826 


8103 


1138 


2924 


4710 


6496 


784CIP2B_827 


8104 


1139 


2925 


4711 


6497 


784CIP2B_828 


8108


1140 


2926 


4712 


6498 


784CIP2B_829 


8110 


1141 


2927 


4713 


6499 


784CIP2B_830 


8116 


1142 


2928 


4714 


6500 


784CIP2B_831 


8117 


1143 


2929 


4715


6501 


784CIP2B_832 


8123 


1144 


2930 


4716 


6502 


784CIP2B_833 


8130 


1145 


2931 


4717 


6503 


784CIP2B 834 


8130 


1146 


2932 


4718 


6504 


784CIP2B 835 


8143


1147 


2933


4719 


6505 


784CIP2B 836 


8143 


1148 


2934 


4720 


6506 


784CIP2B_837 


8154 


1149 


2935 


4721 


6507 


784CIP2B_838 


8155 


1150 


2936 


4722 


6508 


784CIP2B 839 


8162 


1151 


2937


4723 


6509 


784CIP2B 840 


8163 


1152 


2938 


4724 


6510 


784CIP2B 841 


8172 


1153 


2939 


4725 


6511 


784CIP2B 842 


8173 


1154 


2940 


4726 


6512 


784CIP2B_843 


8179 


1155 


2941 


4727 


6513 


784CIP2B 844 


8182 


1156 


2942 


4728 


6514 


784CIP2B 845 


8183 


1157 


2943 


4729 


6515 


784CIP2B 846 


8184 


1158 


2944 


4730 


6516 


784CIP2B 847 


8185 


1159 


2945 


4731


6517 


784CIP2B 848 


8187 


1160 


2946 


4732 


6518 


784CIP2B 849 


8188


1161


2947 


4733 


6519 


784CIP2B 850 


8190 


1162 


2948 


4734 


6520 


784CIP2B_851 


8190 


1163 


2949 


4735 


6521 


784CIP2B 852 


8192 


1164 


2950 


4736 


6522 


784CIP2B_853 


8193 


1165 


2951 


4737 


6523 


784CIP2B 854 


8197


1166 


2952 


4738 


6524 


784CIP2B_855 


8197 


1167 


2953 


4739 


6525 


784CIP2B 856 


8199 


1168 


2954 


4740 


6526 


784CIP2B 857 


8202 


1169 


2955 


4741 


6527 


784CIP2B_858 


8203 


1170 


2956 


4742 


6528 


784CIP2B 859 


8208 


1171 


2957 


4743 


6529


784CIP2B 860 


8209 


1172 


2958 


4744 


6530 


784CIP2B 861 


8211 


1173 


2959


4745


6531 


784CIP2B_862 


8214 


1174


2960


4746 


6532 


784CIP2B 863 


8217 


1175 


2961 


4747 


6533 


784CIP2B 864 


8223


SEQ ID NO: of full-length nucleotide sequence


SEQ ID NO: of full-length peptide sequence


SEQ ID NO: of contig nucleotide sequence


SEQ ID NO: of contig peptide sequence


Priority docket number_corresponding SEQ ID NO: in priority application


SEQ ID NO: in U.S.S.N. 09/488,725




1176 


2962


4748 


6534 


784CIP2B_865


8224 


1177 


2963 


4749 


6535 


784CIP2B_866 


8226


1178 


2964 


4750 


6536 


784CIP2B 867 


8227


1179 


2965 


4751 


6537 


784CIP2B_868 


8229 


1180 


2966 


4752 


6538 


784CIP2B_869 


8232 


1181 


2967 


4753 


6539 


784CIP2B_870 


8236 


1182 


2968


4754 


6540 


784CIP2B_871 


8239 


1183 


2969 


4755 


6541 


784CIP2B_872 


8244 


1184 


2970 


4756 


6542 


784CIP2B 873 


8245 


1185 


2971 


4757 


6543 


784CIP2B_874


8248 


1186 


2972 


4758 


6544 


784CIP2B_875 


8251 


1187 


2973 


4759 


6545 


784CIP2B_876 


8253 


1188 


2974 


4760 


6546 


784CIP2B_877 


8260 


1189 


2975 


4761 


6547 


784CIP2B_878 


8262 


1190 


2976 


4762 


6548 


784CIP2B_879 


8268 


1191 


2977 


4763 


6549 


784CIP2B_880


8270 


1192 


2978 


4764 


6550 


784CIP2B_881


8272 


1193 


2979 


4765 


6551 


784CIP2B_882


8274 


1194 


2980 


4766 


6552 


784CIP2B_883 


8274 


1195 


2981 


4767 


6553 


784CIP2B_884 


8275 


1196 


2982 


4768 


6554 


784CIP2B_885 


8277 


1197 


2983 


4769 


6555 


784CIP2B_886 


8281 


1198 


2984 


4770 


6556 


784CIP2B_887 


8283 


1199 


2985 


4771 


6557 


784CIP2B 888 


8289 


1200 


2986 


4772 


6558 


784CIP2B_889 


8295 


1201 


2987 


4773 


6559 


784CIP2B_890


8300 


1202 


2988 


4774 


6560 


784CIP2B_891 


8303 


1203 


2989 


4775 


6561 


784CIP2B_892


8304 


1204 


2990 


4776 


6562 


784CIP2B_893


8305 


1205 


2991 


4777 


6563 


784CIP2B_894


8309 


1206 


2992 


4778 


6564 


784CIP2B_895 


8318 


1207 


2993 


4779 


6565


784CIP2B 896 


8319 


1208 


2994 


4780 


6566 


784CIP2B_897 


8321 


1209 


2995 


4781 


6567 


784CIP2B_898 


8322 


1210 


2996 


4782 


6568 


784CIP2B 899 


8323 


1211 


2997 


4783 


6569


784CIP2B_900 


8325 


1212 


2998 


4784 


6570 


784CIP2B 901 


8331 


1213 


2999 


4785 


6571 


784CIP2B 902 


8332 


1214 


3000 


4786 


6572 


784CIP2B_903 


8333 


1215 


3001 


4787


6573 


784CIP2B_904


8335


1216 


3002 


4788 


6574 


784CIP2B_905 


8336 


1217 


3003 


4789 


6575 


784CIP2B_906 


8337 


1218 


3004 


4790 


6576 


784CIP2B 907 


8340 


1219 


3005 


4791 


6577 


784CIP2B 908 


8343 


1220 


3006 


4792 


6578 


784CIP2B_909 


8347 


1221 


3007 


4793 


6579 


784CIP2B_910 


8349 


1222 


3008 


4794 


6580 


784CIP2B 911 


8351 


1223 


3009 


4795 


6581 


784CIP2B_912 


8353 


1224 


3010 


4796 


6582 


784CIP2B_913 


8355 


1225 


3011 


4797 


6583 


784CIP2B_914 


8361 


1226 


3012 


4798 


6584 


784CIP2B_915 


8365 


1227 


3013 


4799 


6585 


784CIP2B 916 


8367 


1228 


3014 


4800 


6586 


784CIP2B 917 


8369


1229 


3015 


4801 


6587 


784CIP2B_919 


8375


1230


3016 


4802 


6588 


784CIP2B_920 


8387 


1231 


3017 


4803 


6589 


784CIP2B_921 


8391 


1232 


3018 


4804 


6590 


784CIP2B_922 


8393 


1233 


3019 


4805 


6591 


784CIP2B_923 


8393 


1234 


3020 


4806 


6592 


784CIP2B_924 


8394 


1235 


3021 


4807 


6593 


784CIP2B 925 


8395 


1236 


3022 


4808 


6594


784CIP2B_926 


8396


1237 


3023 


4809 


6595 


784CIP2B_927 


8398 






WO 01/53312 



PCT/US00/34263 



SEQ ID NO: of full-length nucleotide sequence


SEQ ID NO: of full-length peptide sequence


SEQ ID NO: of contig nucleotide sequence


SEQ ID NO: of contig peptide sequence


Priority docket number_corresponding SEQ ID NO: in priority application


SEQ ID NO: in U.S.S.N. 09/488,725


1238 


3024 


4810 


6596 


784CIP2B_928 


8402 


1239 


3025 


4811 


6597 


784CIP2B 929 


8402 


1240 


3026 


4812 


6598 


784CIP2B_930 


8405 


1241 


3027 


4813 


6599 


784CIP2B 931 


8406 


1242 


3028 


4814 


6600 


784CIP2B 932 


8409 


1243 


3029 


4815 


6601 


784CIP2B_933 


8410 


1244 


3030


4816 


6602 


784CIP2B_934 


8414


1245 


3031 


4817 


6603 


784CIP2B 935 


8415 


1246 


3032 


4818 


6604 


784CIP2B_936


8419 


1247 


3033 


4819


6605 


784CIP2B_937 


8426 


1248 


3034 


4820 


6606 


784CIP2B_938 


8430 


1249 


3035 


4821 


6607 


784CIP2B 939 


8431 


1250 


3036 


4822 


6608 


784CIP2B_940 


8432 


1251 


3037 


4823 


6609 


784CIP2B_941 


8433 


1252 


3038 


4824 


6610 


784CIP2B_942 


8434 


1253 


3039 


4825 


6611 


784CIP2B_943 


8438


1254 


3040 


4826 


6612 


784CIP2B_944 


8439 


1255 


3041 


4827 


6613 


784CIP2B_945 


8441 


1256 


3042 


4828 


6614 


784CIP2B_946 


8450 


1257 


3043 


4829 


6615 


784CIP2B 947 


8451 


1258


3044 


4830 


6616 


784CIP2B_948 


8452 


1259 


3045 


4831 


6617 


784CIP2B_949 


8460 


1260 


3046 


4832 


6618 


784CIP2B_950 


8461 


1261 


3047 


4833


6619 


784CIP2B 951 


8462 


1262 


3048 


4834 


6620 


784CIP2B 952 


8464


1263 


3049 


4835 


6621 


784CIP2B_953 


8465 


1264 


3050 


4836 


6622 


784CIP2B_954 


8467 


1265 


3051


4837 


6623 


784CIP2B_955 


8470 


1266 


3052 


4838 


6624 


784CIP2B_956


8471 


1267 


3053 


4839 


6625 


784CIP2B_957 


8473 


1268 


3054 


4840 


6626 


784CIP2B_958 


8474


1269 


3055 


4841 


6627 


784CIP2B_959


8475


1270 


3056 


4842 


6628 


784CIP2B_960 


8476 


1271 


3057 


4843 


6629 


784CIP2B_961 


8480 


1272 


3058 


4844 


6630 


784CIP2B_962 


8482 


1273 


3059 


4845 


6631 


784CIP2B 963 


8482 


1274 


3060 


4846 


6632 


784CIP2B_964


8486 


1275 


3061 


4847 


6633 


784CIP2B_965 


8488


1276 


3062 


4848 


6634 


784CIP2B_966 


8492 


1277 


3063 


4849 


6635 


784CIP2B_967 


8494 


1278 


3064 


4850


6636


784CIP2B_968


8496 


1279 


3065 


4851 


6637 


784CIP2B_969 


8497 


1280


3066 


4852 


6638 


784CIP2B_970 


8499 


1281 


3067 


4853 


6639 


784CIP2B_971 


8513 


1282 


3068 


4854 


6640 


784CIP2B 972 


8522 


1283


3069


4855 


6641 


784CIP2B_973 


8526 


1284 


3070 


4856 


6642 


784CIP2B 974 


8531 


1285 


3071 


4857 


6643 


784CIP2B_975 


8533 


1286 


3072 


4858 


6644 


784CIP2B_976 


8542 


1287 


3073 


4859 


6645 


784CIP2B_977 


8544


1288 


3074 


4860 


6646 


784CIP2B_978 


8565 


1289 


3075 


4861 


6647 


784CIP2B_979


8565 


1290 


3076 


4862 


6648 


784CIP2B 980 


8572


1291


3077


4863 


6649 


784CIP2B 981 


8576 


1292 


3078 


4864 


6650 


784CIP2B_982 


8578 


1293 


3079 


4865 


6651 


784CIP2B_983


8584 


1294 


3080 


4866 


6652 


784CIP2B_984 


8598 


1295 


3081 


4867 


6653 


784CIP2B 985 


8602 


1296 


3082 


4868 


6654 


784CIP2B_986 


8604


1297 


3083 


4869 


6655 


784CIP2B_987 


8609 


1298 


3084 


4870 


6656 


784CIP2B 988 


8612 


1299 


3085 


4871 


6657 


784CIP2B 989 


8637


SEQ ID NO: of full-length nucleotide sequence


SEQ ID NO: of full-length peptide sequence


SEQ ID NO: of contig nucleotide sequence


SEQ ID NO: of contig peptide sequence


Priority docket number_corresponding SEQ ID NO: in priority application


SEQ ID NO: in U.S.S.N. 09/488,725


1300


3086


4872 


6658 


784CIP2B_990


8640


1301 


3087 


4873 


6659 


784CIP2B_991


8643


1302


3088 


4874 


6660 


784CIP2B_992


8645 


1303


3089 


4875 


6661 


784CIP2B 993 


8650 


1304


3090 


4876 


6662 


784CIP2B 994 


8651 


1305 


3091 


4877 


6663 


784CIP2B 995 


8654 


1306 


3092 


4878 


6664 


784CIP2B 996 


8655


1307 


3093 


4879 


6665 


784CIP2B 997 


8657 


1308 


3094 


4880 


6666 


784CIP2B_998 


8665 


1309 


3095 


4881 


6667 


784CIP2B 999 


8668 


1310 


3096 


4882 


6668 


784CIP2B 1000 


8671 


1311 


3097 


4883 


6669 


784CIP2B_1001 


8672 


1312 


3098 


4884 


6670 


784CIP2B_1002


8692 


1313 


3099 


4885 


6671 


784CIP2B 1003 


8706 


1314 


3100 


4886 


6672 


784CIP2B_1004


8716 


1315


3101 


4887 


6673 


784CIP2B 1005 


8719 


1316 


3102 


4888 


6674 


784CIP2B_1006 


8743 


1317 


3103 


4889 


6675 


784CIP2B_1007 


8764 


1318 


3104 


4890 


6676 


784CIP2B_1008 


8764 


1319 


3105 


4891 


6677 


784CIP2B_1009 


8764 


1320 


3106 


4892 


6678 


784CIP2B_1010


8774 


1321 


3107 


4893 


6679 


784CIP2B_1011 


8782 


1322 


3108 


4894 


6680 


784CIP2B 1012 


8796 


1323 


3109 


4895 


6681 


784CIP2B_1013 


8827 


1324 


3110 


4896 


6682 


784CIP2B_1014 


8842 


1325 


3111 


4897 


6683 


784CIP2B_1015 


8842 


1326 


3112 


4898 


6684 


784CIP2B_1016 


8858 


1327 


3113 


4899 


6685 


784CIP2B 1017 


8871 


1328 


3114 


4900 


6686 


784CIP2B 1018 


8921 


1329


3115 


4901 


6687 


784CIP2B 1019 


8927 


1330 


3116 


4902 


6688 


784CIP2B_1020 


8942 


1331 


3117 


4903 


6689


784CIP2B 1021 


8994 


1332 


3118 


4904 


6690


784CIP2B_1022


9023 


1333 


3119 


4905 


6691 


784CIP2B_1023 


9028 


1334 


3120 


4906 


6692 


784CIP2B 1024 


9058 


1335 


3121 


4907 


6693 


784CIP2B 1025 


9058 


1336 


3122 


4908 


6694 


784CIP2B 1026 


9079 


1337 


3123 


4909 


6695 


784CIP2B_1027


9079 


1338 


3124 


4910 


6696 


784CIP2B_1028 


9082 


1339


3125 


4911 


6697 


784CIP2B 1029 


9084


1340 


3126 


4912 


6698


784CIP2B_1030


9093 


1341 


3127 


4913 


6699 


784CIP2B_1031 


9101 


1342 


3128 


4914 


6700 


784CIP2B_1032


9103 


1343 


3129 


4915 


6701 


784CIP2B_1033 


9105 


1344 


3130 


4916 


6702 


784CIP2B_1034 


9151 


1345 


3131 


4917 


6703 


784CIP2B_1035


9161 


1346 


3132 


4918 


6704 


784CIP2B_1036 


9172 


1347 


3133 


4919 


6705 


784CIP2B_1037 


9174


1348 


3134 


4920 


6706 


784CIP2B_1038 


9204 


1349 


3135 


4921 


6707 


784CIP2B 1039 


9234 


1350 


3136 


4922 


6708 


784CIP2B_1040 


9235 


1351 


3137 


4923 


6709 


784CIP2B_1041 


9239 


1352 


3138 


4924 


6710 


784CIP2B_1042


9256 


1353 


3139 


4925 


6711 


784CIP2B_1043 


9276 


1354 


3140 


4926 


6712 


784CIP2B_1044 


9345 


1355 


3141 


4927 


6713 


784CIP2B_1045 


9379 


1356 


3142 


4928 


6714 


784CIP2B_1046 


9435 


1357 


3143 


4929 


6715 


784CIP2B_1047 


9437 


1358 


3144 


4930 


6716 


784CIP2B_1048 


9469


1359 


3145 


4931 


6717 


784CIP2B_1049


9500 


1360 


3146 


4932 


6718 


784CIP2B_1050


9502 


1361 


3147 


4933 


6719 


784CIP2B_1051


9520


SEQ ID NO: of full-length nucleotide sequence


SEQ ID NO: of full-length peptide sequence


SEQ ID NO: of contig nucleotide sequence


SEQ ID NO: of contig peptide sequence


Priority docket number_corresponding SEQ ID NO: in priority application


SEQ ID NO: in U.S.S.N. 09/488,725




1362


3148


4934 


6720


784CIP2B_1052


9541


1363


3149


4935 


6721 


784CIP2B_1053 


9541


1364


3150


4936 


6722 


784CIP2B_1054 


9548


1365


3151 


4937 


6723 


784CIP2B_1055 


9556 


1366


3152


4938


6724 


784CIP2B_1056 


9556 


1367


3153


4939 


6725 


784CIP2B_1057 


9575


1368


3154


4940


6726 


784CIP2B_1058 


9589 


1369


3155 


4941 


6727 


784CIP2B_1059 


9599 


1370


3156


4942 


6728 


784CIP2B_1060 


9602 


1371


3157


4943 


6729 


784CIP2B_1061 


9606 


1372


3158


4944 


6730 


784CIP2B 1062 


9622 


1373


3159 


4945 


6731 


784CIP2B_1063 


9623 


1374


3160


4946 


6732 


784CIP2B_1064 


9646


1375


3161


4947 


6733 


784CIP2B_1065 


9747 


1376 


3162 


4948 


6734 


784CIP2B 1066 


9773 


1377 


3163


4949 


6735 


784CIP2B_1067


9785 


1378 


3164 


4950 


6736 


784CIP2B_1068


9801 


1379 


3165 


4951 


6737 


784CIP2B 1069 


9811 


1380


3166 


4952 


6738


784CIP2B_1070 


9843 


1381 


3167


4953


6739 


784CIP2B_1071 


9854 


1382 


3168 


4954 


6740 


784CIP2B_1072 


9854 


1383


3169 


4955 


6741 


784CIP2B_1073 


9864 


1384 


3170 


4956 


6742 


784CIP2B 1074 


9864


1385 


3171 


4957 


6743 


784CIP2B_1075 


9871 


1386


3172 


4958 


6744 


784CIP2B_1076 


9879 


1387


3173 


4959 


6745 


784CIP2B 1077 


9881 


1388


3174 


4960 


6746 


784CIP2B 1078 


9885


1389


3175 


4961 


6747 


784CIP2B_1079 


9901 


1390


3176 


4962 


6748 


784CIP2B_1080 


9912 


1391 


3177 


4963 


6749 


784CIP2B_1081 


9916


1392


3178


4964 


6750 


784CIP2B 1082 


9921 


1393


3179 


4965 


6751 


784CIP2B_1083


9925 


1394


3180 


4966 


6752 


784CIP2B_1084 


9930 


1395 


3181


4967 


6753 


784CIP2B 1085 


9949 


1396 


3182 


4968 


6754 


784CIP2B_1086


9951 


1397


3183 


4969 


6755 


784CIP2B_1087 


9959 


1398


3184 


4970 


6756 


784CIP2B_1088 


9973 


1399


3185 


4971 


6757 


784CIP2B 1089 


9982 


1400 


3186 


4972 


6758 


784CIP2B_1090 


9994 


1401


3187 


4973 


6759 


784CIP2B_1091 


10021 


1402


3188 


4974 


6760 


784CIP2B_1092


10041 


1403


3189 


4975 


6761 


784CIP2B_1094


10067 


1404


3190 


4976


6762 


784CIP2B_1095 


10073


1405


3191


4977 


6763 


784CIP2B 1096 


10112 


1406


3192 


4978 


6764 


784CIP2B_1097


10117 


1407


3193 


4979 


6765 


784CIP2B_1098 


10132 


1408


3194 


4980 


6766 


784CIP2B_1099 


10164


1409


3195 


4981 


6767 


784CIP2B 1100 


10217 


1410


3196 


4982


6768 


784CIP2B_1101


10226 


1411 


3197 


4983 


6769 


784CIP2B_1102


10232 


1412 


3198 


4984 


6770 


784CIP2B 1103 


10237 


1413 


3199


4985 


6771 


784CIP2B_1104 


10279 


1414 


3200 


4986 


6772 


784CIP2C 1 


33 


1415 


3201


4987 


6773 


784CIP2C 2 


271 


1416 


3202 


4988 


6774 


784CIP2C 3 


848 


1417 


3203 


4989 


6775 


784CIP2C 4 


849 


1418 


3204 


4990 


6776 


784CIP2C 5 


864 


1419 


3205 


4991 


6777 


784CIP2C 6 


953 


1420 


3206 


4992 


6778 


784CIP2C 7 


980 


1421 


3207 


4993 


6779 


784CIP2C 8 


1595 


1422 


3208 


4994 


6780 


784CIP2C 9 


1697 


1423 


3209 


4995 


6781 


784CIP2C 10 


1744 






SEQ ID NO: of full-length nucleotide sequence


SEQ ID NO: of full-length peptide sequence


SEQ ID NO: of contig nucleotide sequence


SEQ ID NO: of contig peptide sequence


Priority docket number_corresponding SEQ ID NO: in priority application


SEQ ID NO: in U.S.S.N. 09/488,725


1424


3210 


4996 


6782 


784CIP2C 11 


1937 


1425 


3211 


4997 


6783 


784CIP2C 12 


1955 


1426 


3212 


4998 


6784 


784CIP2C_13


1955 


1427 


3213 


4999 


6785 


784CIP2C 14 


2185 


1428 


3214 


5000 


6786 


784CIP2C 15 


2889 


1429 


3215 


5001 


6787 


784CIP2C 16 


2901 


1430 


3216 


5002 


6788 


784CIP2C_17 


2902 


1431 


3217 


5003 


6789 


784CIP2C 18 


2905 


1432 


3218 


5004 


6790 


784CIP2C 19 


2948 


1433 


3219 


5005 


6791 


784CIP2C_20 


2956 


1434 


3220 


5006 


6792 


784CIP2C 21 


2959 


1435


3221 


5007 


6793


784CIP2C 22 


2965 


1436 


3222 


5008 


6794 


784CIP2C 23 


2966 


1437 


3223 


5009 


6795 


784CIP2C_24 


2970 


1438 


3224 


5010 


6796 


784CIP2C 25 


2985 


1439 


3225 


5011 


6797 


784CIP2C 26 


2987 


1440 


3226 


5012 


6798 


784CIP2C 27 


2993 


1441 


3227 


5013 


6799 


784CIP2C_28


2993 


1442 


3228 


5014 


6800 


784CIP2C 29 


3017 


1443 


3229 


5015 


6801 


784CIP2C_30


3046 


1444 


3230 


5016 


6802 


784CIP2C 31 


3050 


1445 


3231 


5017 


6803 


784CIP2C_32


3357 


1446 


3232 


5018 


6804 


784CIP2C_33 


3359 


1447 


3233 


5019 


6805 


784CIP2C 34 


3432 


1448 


3234 


5020 


6806 


784CIP2C_35 


3438 


1449 


3235 


5021 


6807 


784CIP2C_36


3439 


1450 


3236 


5022 


6808 


784CIP2C_39


3463 


1451 


3237 


5023 


6809 


784CIP2C 40 


3466 


1452 


3238 


5024 


6810 


784CIP2C 41 


3466 


1453 


3239 


5025 


6811


784CIP2C_42 


3467 


1454 


3240 


5026 


6812 


784CIP2C 43 


3468 


1455 


3241 


5027 


6813 


784CIP2C 44 


3483 


1456 


3242 


5028 


6814 


784CIP2C_45 


3484 


1457 


3243 


5029 


6815 


784CIP2C 46 


3488 


1458 


3244 


5030 


6816 


784CIP2C_47 


3491 


1459 


3245 


5031 


6817 


784CIP2C_48 


3493 


1460 


3246 


5032 


6818 


784CIP2C_49


3494 


1461 


3247 


5033 


6819 


784CIP2C_50 


3495 


1462 


3248 


5034 


6820 


784CIP2C_51 


3496 


1463 


3249 


5035 


6821 


784CIP2C 52 


3503 


1464 


3250 


5036 


6822 


784CIP2C_53 


3503 


1465 


3251 


5037 


6823 


784CIP2C 54 


3504 


1466 


3252 


5038 


6824 


784CIP2C_55


3511 


1467 


3253 


5039 


6825 


784CIP2C_56


3531 


1468 


3254 


5040 


6826 


784CIP2C 57 


3536 


1469 


3255 


5041 


6827 


784CIP2C_58 


3546 


1470 


3256 


5042 


6828 


784CIP2C 59 


3548 


1471 


3257 


5043 


6829 


784CIP2C 60 


3551 


1472


3258 


5044 


6830 


784CIP2C 61 


3553 


1473 


3259 


5045 


6831 


784CIP2C 62 


3564


1474 


3260 


5046 


6832


784CIP2C 63 


3567 


1475 


3261 


5047 


6833 


784CIP2C 64 


3572 


1476 


3262 


5048 


6834 


784CIP2C 65 


3573 


1477 


3263 


5049 


6835 


784CIP2C 66 


3574 


1478 


3264 


5050 


6836 


784CIP2C 67 


3583


1479


3265 


5051 


6837 


784CIP2C 68 


3615 


1480 


3266 


5052 


6838 


784CIP2C_69 


3623 


1481 


3267 


5053 


6839 


784CIP2C 70 


3629 


1482 


3268 


5054 


6840 


784CIP2C 71 


3666 


1483 


3269 


5055 


6841 


784CIP2C 72 


3667 


1484 


3270 


5056 


6842 


784CIP2C 73 


3906 


1485 


3271 


5057 


6843 


784CIP2C_74


3912


SEQ ID NO: of full-length nucleotide sequence


SEQ ID NO: of full-length peptide sequence


SEQ ID NO: of contig nucleotide sequence


SEQ ID NO: of contig peptide sequence


Priority docket number_corresponding SEQ ID NO: in priority application


SEQ ID NO: in U.S.S.N. 09/488,725




1486


3272 


5058 


6844 


784CIP2C_75


3924 


1487 


3273 


5059 


6845


784CIP2C 76 


3928 


1488 


3274 


5060


6846 


784CIP2C_77 


3935


1489 


3275 


5061 


6847 


784CIP2C_78


3959 


1490 


3276 


5062 


6848 


784CIP2C_79 


3981


1491 


3277 


5063 


6849 


784CIP2C_80


3989 


1492 


3278 


5064 


6850 


784CIP2C 81 


4295 


1493 


3279 


5065 


6851 


784CIP2C_82 


4300 


1494 


3280 


5066 


6852 


784CIP2C 83 


4360 


1495 


3281 


5067 


6853 


784CIP2C 84 


4362


1496 


3282 


5068


6854 


784CIP2C_85 


4371


1497 


3283 


5069 


6855 


784CIP2C_86


4373 


1498 


3284 


5070 


6856 


784CIP2C_87 


4376


1499 


3285 


5071 


6857 


784CIP2C_89 


4378 


1500 


3286 


5072 


6858 


784CIP2C_90


4382 


1501 


3287 


5073 


6859 


784CIP2C_91 


4409 


1502 


3288 


5074 


6860 


784CIP2C_92 


4421 


1503 


3289 


5075 


6861 


784CIP2C_93


4421 


1504 


3290 


5076 


6862 


784CIP2C_94 


4426 


1505 


3291 


5077 


6863 


784CIP2C_95 


4430 


1506 


3292 


5078 


6864 


784CIP2C_96 


4435 


1507 


3293 


5079 


6865 


784CIP2C_97


4436 


1508 


3294 


5080 


6866 


784CIP2C 98 


4439 


1509 


3295 


5081 


6867


784CIP2C_99 


4440 


1510 


3296 


5082 


6868 


784CIP2C_100 


4441 


1511 


3297 


5083


6869


784CIP2C_101 


4442 


1512 


3298 


5084 


6870 


784CIP2C_102 


4455 


1513 


3299 


5085 


6871 


784CIP2C_103 


4462 


1514 


3300 


5086 


6872


784CIP2C_104 


4466 


1515 


3301 


5087 


6873 


784CIP2C_105 


4469 


1516 


3302 


5088 


6874 


784CIP2C_106 


4477 


1517 


3303 


5089 


6875 


784CIP2C_107 


4481 


1518 


3304 


5090 


6876 


784CIP2C_108 


4483 


1519 


3305 


5091 


6877 


784CIP2C_109 


4484 


1520 


3306 


5092 


6878 


784CIP2C 110 


4486 


1521 


3307 


5093


6879 


784CIP2C 111 


4490 


1522 


3308 


5094


6880 


784CIP2C_112 


4499 


1523 


3309 


5095 


6881 


784CIP2C_113 


4503 


1524 


3310 


5096 


6882 


784CIP2C_114 


4506 


1525


3311 


5097


6883


784CIP2C_115 


4509 


1526 


3312 


5098 


6884 


784CIP2C_116


4514 


1527 


3313 


5099 


6885


784CIP2C_117


4516 


1528 


3314 


5100 


6886 


784CIP2C_118 


4522 


1529 


3315 


5101 


6887


784CIP2C_119 


4525 


1530 


3316 


5102 


6888


784CIP2C_120 


4527 


1531 


3317 


5103 


6889 


784CIP2C_121 


4528 


1532 


3318 


5104 


6890 


784CIP2C_122 


4529


1533 


3319 


5105 


6891 


784CIP2C_123


4532 


1534 


3320 


5106 


6892 


784CIP2C_124


4537 


1535 


3321 


5107 


6893 


784CIP2C_125 


4538 


1536 


3322 


5108 


6894 


784CIP2C_126


4551 


1537 


3323 


5109 


6895 


784CIP2C_127 


4552


1538 


3324 


5110 


6896 


784CIP2C 128 


4559


1539 


3325 


5111 


6897 


784CIP2C_129 


4567 


1540 


3326 


5112 


6898 


784CIP2C_130 


4568 


1541 


3327 


5113 


6899 


784CIP2C_132 


4585 


1542 


3328 


5114 


6900 


784CIP2C_133


4592 


1543 


3329 


5115 


6901 


784CIP2C 134 


4609 


1544 


3330 


5116 


6902 


784CIP2C_135 


4616 


1545 


3331 


5117 


6903 


784CIP2C 136 


4617 


1546 


3332 


5118 


6904 


784CIP2C_137 


4618 


1547 


3333 


5119 


6905 


784CIP2C_138 


4620


SEQ ID NO: of full-length nucleotide sequence


SEQ ID NO: of full-length peptide sequence


SEQ ID NO: of contig nucleotide sequence


SEQ ID NO: of contig peptide sequence


Priority docket number_corresponding SEQ ID NO: in priority application


SEQ ID NO: in U.S.S.N. 09/488,725




1548 


3334 


5120 


6906 


784CIP2C_139


4624 


1549 


3335 


5121 


6907 


784CIP2C_140 


4632 


1550 


3336 


5122 


6908 


784CIP2C_141 


4634 


1551 


3337 


5123 


6909 


784CIP2C_142 


4638 


1552 


3338 


5124 


6910 


784CIP2C_143 


4639 


1553 


3339 


5125 


6911


784CIP2C_144


4643 


1554 


3340 


5126 


6912 


784CIP2C 145 


4644 


1555 


3341 


5127 


6913 


784CIP2C_146 


4655 


1556 


3342 


5128 


6914 


784CIP2C_147 


4668 


1557 


3343 


5129 


6915 


784CIP2C_148 


4677 


1558


3344 


5130 


6916 


784CIP2C 149 


4677 


1559 


3345 


5131


6917 


784CIP2C_150


4677 


1560 


3346 


5132 


6918 


784CIP2C_152 


4682 


1561 


3347 


5133 


6919 


784CIP2C_153


4690 


1562


3348 


5134 


6920 


784CIP2C_154


4691 


1563 


3349 


5135 


6921 


784CIP2C_155


4727 


1564 


3350 


5136 


6922 


784CIP2C_156 


4730 


1565 


3351 


5137 


6923 


784CIP2C_157


4734


1566 


3352 


5138 


6924


784CIP2C_158 


4757 


1567 


3353 


5139 


6925 


784CIP2C_159


4764 


1568 


3354 


5140 


6926 


784CIP2C_160 


4786 


1569 


3355 


5141 


6927 


784CIP2C_161 


4793 


1570 


3356 


5142 


6928 


784CIP2C_162 


4821


1571 


3357 


5143 


6929 


784CIP2C 163 


4826


1572 


3358 


5144 


6930 


784CIP2C_164


4850 


1573 


3359 


5145 


6931 


784CIP2C_165 


4853 


1574 


3360 


5146 


6932 


784CIP2C_166


4855


1575 


3361 


5147 


6933 


784CIP2C_167


4856 


1576 


3362 


5148 


6934 


784CIP2C_168


4867 


1577 


3363 


5149 


6935 


784CIP2C_169


4869 


1578 


3364 


5150 


6936 


784CIP2C 170 


4878 


1579 


3365 


5151 


6937 


784CIP2C_171


4880 


1580 


3366 


5152 


6938


784CIP2C_172


4942 


1581 


3367 


5153 


6939 


784CIP2C_173 


4945 


1582 


3368 


5154 


6940 


784CIP2C 174 


4950 


1583 


3369 


5155 


6941


784CIP2C_175


4952 


1584 


3370 


5156 


6942 


784CIP2C_176


4954 


1585 


3371 


5157 


6943 


784CIP2C_177 


4958 


1586 


3372 


5158 


6944 


784CIP2C_178


4961 


1587 


3373 


5159 


6945 


784CIP2C 179 


5590 


1588 


3374 


5160 


6946 


784CIP2C 180 


5599 


1589 


3375 


5161 


6947 


784CIP2C_181


5692 


1590 


3376 


5162 


6948 


784CIP2C_182 


5732 


1591 


3377 


5163 


6949 


784CIP2C_183 


5765 


1592 


3378 


5164 


6950 


784CIP2C_184


5771 


1593 


3379 


5165 


6951 


784CIP2C_185 


5774 


1594 


3380 


5166 


6952 


784CIP2C_186


5793 


1595 


3381 


5167 


6953 


784CIP2C_187


5806 


1596


3382 


5168 


6954 


784CIP2C_188 


5852 


1597 


3383 


5169 


6955 


784CIP2C_189


5892 


1598 


3384 


5170 


6956 


784CIP2C_190 


6057 


1599 


3385 


5171 


6957 


784CIP2C_191 


6061 


1600 


3386 


5172 


6958 


784CIP2C_192 


6109 


1601 


3387 


5173 


6959 


784CIP2C_193 


6160 


1602 


3388 


5174 


6960 


784CIP2C 194 


6297 


1603 


3389 


5175 


6961 


784CIP2C_195


6398 


1604 


3390 


5176 


6962 


784CIP2C_196 


6398


1605 


3391 


5177 


6963 


784CIP2C 197 


6415 


1606 


3392 


5178 


6964 


784CIP2C_198 


6448 


1607 


3393 


5179 


6965 


784CIP2C_199 


6469 


1608 


3394 


5180 


6966 


784CIP2C_200


£47* 


1609 


3395 


5181 


6967 


784CIP2C_201 


6561


SEQ ID NO: of full-length nucleotide sequence


SEQ ID NO: of full-length peptide sequence


SEQ ID NO: of contig nucleotide sequence


SEQ ID NO: of contig peptide sequence


Priority docket number_corresponding SEQ ID NO: in priority application


SEQ ID NO: in U.S.S.N. 09/488,725




1610 


3396 


5182 


6968 


784CIP2C_202 


6574 


1611 


3397 


5183 


6969


784CIP2C 203 


6578 


1612 


3398 


5184 


6970 


784CIP2C 204 


6662 


1613 


3399 


5185 


6971 


784CIP2C_205 


6672 


1614 


3400 


5186 


6972 


784CIP2C_206 


6691 


1615 


3401 


5187 


6973 


784CIP2C_207 


6695 


1616 


3402 


5188 


6974 


784CIP2C_208 


6746 


1617 


3403 


5189 


6975 


784CIP2C_209 


6898 


1618 


3404 


5190 


6976 


784CIP2C_210 


6938 


1619 


3405 


5191 


6977 


784CIP2C_211 


6943 


1620 


3406 


5192 


6978


784CIP2C_212 


7110 


1621 


3407 


5193 


6979 


784CIP2C_213 


7200 


1622 


3408 


5194 


6980


784CIP2C_214 


7212 


1623 


3409 


5195 


6981 


784CIP2C_215 


7218 


1624 


3410 


5196 


6982 


784CIP2C_216


7249 


1625 


3411 


5197 


6983 


784CIP2C_217 


7500 


1626 


3412 


5198 


6984 


784CIP2C_218


7509 


1627 


3413 


5199 


6985 


784CIP2C_219 


7523 


1628 


3414 


5200 


6986 


784CIP2C_220


7544 


1629 


3415 


5201 


6987 


784CIP2C_221 


7564 


1630 


3416 


5202 


6988 


784CIP2C_222 


7568 


1631 


3417 


5203 


6989 


784CIP2C_223 


7631 


1632 


3418 


5204 


6990 


784CIP2C_224


7813 


1633 


3419 


5205 


6991 


784CIP2C_225 


7831 


1634


3420 


5206 


6992 


784CIP2C_226


7843 


1635 


3421 


5207 


6993 


784CIP2C_227 


7907 


1636 


3422 


5208 


6994 


784CIP2C_228 


7943 


1637


3423 


5209 


6995 


784CIP2C_229


8175 


1638 


3424 


5210 


6996 


784CIP2C_230


8216


1639 


3425 


5211 


6997 


784CIP2C_231 


8225


1640 


3426 


5212 


6998 


784CIP2C_232 


8271


1641 


3427 


5213 


6999 


784CIP2C_233


8397


1642


3428 


5214 


7000 


784CIP2C_234 


6466 


1643 


3429 


5215 


7001 


784CIP2C_235 


8503


1644 


3430 


5216 


7002 


784CIP2C_236


8953 


1645 


3431 


5217 


7003 


784CIP2C_237


9106 


1646 


3432 


5218 


7004 


784CIP2C_238 


9139 


1647 


3433 


5219 


7005 


784CIP2C_239 


9555 


1648 


3434 


5220 


7006 


784CIP2C_240


9650 


1649 


3435 


5221 


7007 


784CIP2C_241 


9889 


1650


3436


5222 


7008 


784CIP2C_242


9933


1651 


3437 


5223 


7009 


784CIP2C_243


9953 


1652 


3438 


5224 


7010 


784CIP2C_244 


9981 


1653 


3439 


5225 


7011 


784CIP2D_1 


746 


1654 


3440 


5226


7012


784CIP2D_2


3558 


1655 


3441 


5227 


7013 


784CIP2D_3


3558 


1656 


3442 


5228 


7014 


784CIP2D_4


3633 


1657 


3443 


5229 


7015 


784CIP2D_5


3658 


1658 


3444 


5230 


7016 


784CIP2D_6


3732 


1659 


3445 


5231 


7017 


784CIP2D_7


4004 


1660 


3446 


5232 


7018 


784CIP2D_8


4700 


1661 


3447 


5233 


7019 


784CIP2D_9


4703 


1662 


3448 


5234 


7020 


784CIP2D_10


4774 


1663 


3449 


5235 


7021 


784CIP2D_11 


4894 


1664 


3450 


5236


7022 


784CIP2D_12 


4918 


1665 


3451 


5237 


7023 


784CIP2D_13


5159 


1666 


3452 


5238 


7024 


784CIP2D_14 


7443 


1667 


3453 


5239 


7025 


784CIP2D_15 


8673 


1668 


3454 


5240 


7026 


784CIP2D_16 


8679 


1669 


3455 


5241 


7027 


784CIP2D_17


8727 


1670 


3456 


5242 


7028 


784CIP2D_18


8734 


1671 


3457 


5243 


7029 


784CIP2D_19


8756 


1672


3458


5244


7030


784CIP2D_20


8818 


1673 


3459


5245


7031


784CIP2D_21


8844 


1674 


3460


5246 


7032 


784CIP2D_22 


8846 


1675 


3461


5247


7033


784CIP2D_23


8912 


1676 


3462


5248


7034 


784CIP2D_24 


8918 


1677 


3463


5249


7035


784CIP2D_25


8918 


1678


3464


5250 


7036 


784CIP2D_26 


8941 


1679


3465


5251


7037


784CIP2D_27


8941 


1680


3466 


5252 


7038 


784CIP2D_28 


8951


1681


3467


5253


7039


784CIP2D_29


8951 


1682


3468


5254


7040


784CIP2D_30


9007 


1683


3469


5255


7041


784CIP2D_31


9012 


1684


3470


5256


7042


784CIP2D_32


9013 


1685


3471


5257


7043


784CIP2D_33


9025 


1686


3472


5258


7044


784CIP2D_34


9053 


1687


3473


5259


7045


784CIP2D_35


9054 


1688


3474


5260


7046


784CIP2D_36


9054 


1689


3475


5261


7047


784CIP2D_37


9113


1690


3476


5262 


7048 


784CIP2D_38 


9134 


1691


3477


5263 


7049 


784CIP2D_39 


9152 


1692


3478


5264


7050


784CIP2D_40


9152 


1693


3479


5265


7051


784CIP2D_41


9211


1694


3480


5266 


7052 


784CIP2D_42 


9223 


1695


3481


5267


7053


784CIP2D_43


9223


1696


3482


5268


7054


784CIP2D_44


9231 


1697


3483


5269


7055


784CIP2D_45


9236 


1698


3484


5270 


7056 


784CIP2D_46 


9236 


1699


3485


5271


7057


784CIP2D_47


9303 


1700


3486


5272


7058


784CIP2D_48


9309 


1701


3487


5273 


7059 


784CIP2D_49 


9314 


1702


3488


5274


7060


784CIP2D_50


9326 


1703


3489 


5275 


7061 


784CIP2D_51 


9339 


1704


3490


5276


7062


784CIP2D_52


9348


1705


3491


5277


7063


784CIP2D_53


9376 


1706


3492


5278


7064


784CIP2D_54


9382 


1707


3493


5279


7065 


784CIP2D_55 


9407 


1708


3494


5280


7066


784CIP2D_56


9414


1709


3495


5281 


7067 


784CIP2D_57 


9439 


1710 


3496


5282


7068


784CIP2D_58


9485 


1711 


3497


5283


7069


784CIP2D_59


9493 


1712 


3498 


5284 


7070 


784CIP2D_60 


9501 


1713


3499


5285


7071


784CIP2D_61


9526 


1714


3500


5286


7072


784CIP2D_62


9526 


1715


3501


5287 


7073 


784CIP2D_63 


9551 


1716


3502


5288 


7074 


784CIP2D_64 


9557 


1717


3503


5289 


7075 


784CIP2D_65 


9568 


1718


3504


5290


7076


784CIP2D_66


9588 


1719 


3505


5291 


7077 


784CIP2D_67 


9597 


1720


3506


5292


7078


784CIP2D_68


9615


1721


3507


5293 


7079 


784CIP2D_69 


9628 


1722 


3508


5294 


7080 


784CIP2D_70 


9649 


1723 


3509 


5295 


7081 


784CIP2D_71


9652 


1724 


3510 


5296 


7082 


784CIP2D_72 


9660 


1725 


3511 


5297


7083


784CIP2D_73


9662 


1726 


3512 


5298 


7084 


784CIP2D_74


9725 


1727 


3513 


5299 


7085 


784CIP2D_75


9746 


1728 


3514 


5300 


7086


784CIP2D_76


9777


1729 


3515 


5301 


7087 


784CIP2D_77


9787 


1730 


3516 


5302 


7088 


784CIP2D_78 


9790 


1731 


3517 


5303 


7089 


784CIP2D_79


9842


1732 


3518 


5304 


7090 


784CIP2D_80


9842 


1733 


3519 


5305 


7091 


784CIP2D_81


9848 


1734 


3520 


5306


7092


784CIP2D_82


9867 


1735 


3521 


5307


7093


784CIP2D_83


10010


1736 


3522 


5308 


7094


784CIP2D_84


10011 


1737 


3523


5309


7095


784CIP2D_85


10052 


1738 


3524 


5310


7096


784CIP2D_86


10057


1739 


3525


5311


7097


784CIP2D_87


10085 


1740 


3526


5312


7098


784CIP2D_89


10139 


1741 


3527


5313


7099 


784CIP2D_90 


10142 


1742 


3528


5314


7100


784CIP2D_92


10165 


1743 


3529 


5315


7101


784CIP2D_93


10173 


1744 


3530


5316


7102 


784CIP2D_94 


10173 


1745 


3531


5317


7103


784CIP2D_95


10273 


1746


3532


5318


7104


784CIP2E_1


3121 


1747 


3533


5319


7105


784CIP2E_2


3628 


1748


3534


5320


7106


784CIP2E_4


3673


1749


3535


5321 


7107 


784CIP2E_5 


4018 


1750


3536


5322


7108


784CIP2E_6


4467


1751


3537


5323


7109


784CIP2E_7


4865


1752


3538


5324


7110


784CIP2E_8


4916 


1753


3539


5325


7111


784CIP2E_9


4923 


1754 


3540


5326


7112


784CIP2E_10


4926 


1755


3541


5327


7113


784CIP2E_11


4962 


1756 


3542


5328 


7114 


784CIP2E_12 


4963 


1757 


3543


5329 


7115 


784CIP2E_13 


4964 


1758 


3544


5330


7116


784CIP2E_14


4988 


1759 


3545


5331


7117


784CIP2E_15


5835 


1760


3546


5332


7118


784CIP2E_16


7682 


1761 


3547


5333 


7119 


784CIP2E_17 


7682 


1762 


3548


5334


7120


784CIP2E_18


7699 


1763


3549


5335


7121


784CIP2E_19


7707 


1764 


3550


5336 


7122 


784CIP2E_20 


7707 


1765 


3551


5337


7123


784CIP2E_21


7752 


1766 


3552


5338


7124


784CIP2E_22


8357 


1767


3553 


5339 


7125 


784CIP2E_23 


9065 


1768 


3554


5340


7126 


784CIP2E_24 


9324 


1769 


3555


5341


7127


784CIP2F_1


2976 


1770 


3556


5342 


7128 


784CIP2F_2 


3559 


1771 


3557


5343 


7129 


784CIP2F_3 


4021 


1772 


3558


5344 


7130 


784CIP2F_4 


4474 


1773 


3559


5345


7131


784CIP2F_5


4566 


1774 


3560


5346


7132


784CIP2F_6


4705 


1775 


3561


5347 


7133 


784CIP2F_7 


4707 


1776


3562


5348


7134


784CIP2F_8


4712 


1777 


3563 


5349


7135 


784CIP2F_9 


5008 


1778 


3564 


5350 


7136 


784CIP2F_10


enno 

j v \jy 


1779 


3565 


5351 


7137 


784CIP2F_11


5015 


1780 


3566 


5352 


7138 


784CIP2F_12 


5015 


1781 


3567


5353


7139


784CIP2F_13


7724 


1782 


3568 


5354 


7140 


784CIP2F_14


7725 


1783 


3569 


5355 


7141 


784CIP2F_15


8828 


1784 


3570 


5356 


7142 


784CIP2F_16 


8830 


1785 


3571 


5357 


7143 


784CIP2F_17


9739 


1786


3572 


5358 


7144 


784CIP2F_18


9896 
TABLE 7



SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion) 
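
The symbol key above can be applied mechanically to any segment in this table. The following is a minimal sketch, not part of the original disclosure: the function name `summarize_segment` and the shape of its return value are illustrative assumptions. It strips whitespace introduced by line wrapping, counts residue letters, and records the positions of the stop-codon, deletion and insertion marks defined in the key.

```python
# One-letter codes and marks exactly as listed in the Table 7 key.
RESIDUES = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
    "X": "Unknown",
}
MARKS = {
    "*": "stop codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def summarize_segment(segment: str) -> dict:
    """Count residues and locate marks in one amino acid segment.

    Hypothetical helper: positions are 0-based offsets into the segment
    after all whitespace has been removed.
    """
    cleaned = "".join(segment.split())
    residues = 0
    marks = {name: [] for name in MARKS.values()}
    unrecognized = []
    for pos, ch in enumerate(cleaned):
        if ch in RESIDUES:
            residues += 1
        elif ch in MARKS:
            marks[MARKS[ch]].append(pos)
        else:
            # Anything outside the key, e.g. a scanning error.
            unrecognized.append((pos, ch))
    return {"residues": residues, "marks": marks, "unrecognized": unrecognized}


# One line of the segment listed for SEQ ID NO: 5359.
example = "TVPYNLRVRATLGSQTS/CLEHP/VSIPLIETQPSLPDL/RMEI"
summary = summarize_segment(example)
print(summary["residues"], summary["marks"]["possible nucleotide deletion"])
```

Characters that fall outside the key are reported rather than dropped, so a segment with scanning damage can be flagged for manual review instead of being silently miscounted.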


5359 


337 


1131 


AHLSARLSALILDEVAILPAPQNLSVLSTNMKHLLMWSPVIAPG 
ETVYYSVEYQGEYESLYTSHIWIPSSWCSLTEGPECDVTDDITA 
TVPYNLRVRATLGSQTS/CLEHP/VSIPLIETQPSLPDL/RMEI 
TKDGFHLVIELEDLGPQFEFLVAYWRREPGAEEHVKMVRSGGIP 
VHLETMEPGAAYCVKAQTFVKAIGRYSAFSQTECVEVQGEAIPL
VLALFAFVGFMLILVWPLFVWKMGRLLQ/YLLLPRGGSSQTPW
KITQF 


5360 


2 


1115 


PRVRSSGGQEDPASQQWARPRFTQPSKMRRRVIARPVGSSVRLK 
CVASGHPRPDITWMKDDQALTRPEAAEPRKKKWTLSLKNLRPED
SGKYTCRVSNRAGAINATYKVDVIQRTRSKPVLTGTHPVNTTVD 
FGGTTSFQCKVRSDVKPVIQWLKRVEYGAEGRHNSTIDVGGQKF
VVLPTGDVWSRPDGSYLNKLLITRARQDDAGMYICLGANTMGYS 
FRSAFLTVLPDPKPPGPPVASSSSATSLPWPWIGIPAGAVFIL 
GTLLLWLCQAQKKPCTPAPAPPLPGHRPPGTARDRSGDKDLPSL
AALSAGPGVGLCEEHGSPAAPQHLLGPGPVAGPKLYPKLYTGHS 
TPHTYTHPPPSCQLNSSHS 


5361 


3 


925 


HEGSISSANILLDDQFQPKLTDFAMAHFRSHLEHQSCTINMTSS
SSKHLWYMPEEYIRQGKLSIKTDVYSFGIVIMEVLTGCRWLDD 
PKHIQLRDLLRELMEKRGLDSCLSFLDKKVPPCPRNFSAKLFCL 
AGRCAATRAKLRPSMDEVLNTLESTQASLYFAEDPPTSLKSFRC 
PSPLFLENVPSIPVEDDESQNNNLLPSDEGLRIDRMTQKTPFEC
SQSEVMFLSLDKKPESKRNEEACNMPSSSCEESWFPKYIVPSQD 
LRPYKVNIDPSSEAPGHSCRSRPVESSCSSKFSWDEYEQYKKE


5362 


2 


4879 


SCQVEGCTRTYNSSQSIGKHMKTAHPDQYAAFKMQRKSKKGQKA 
NNLNT PNNGK F V Y FLP S P VNS S N P F F T S QTKANGNPAC S AQLQH 
VSPPIFPAHLASVSTPLLSSMESVINPNITSQDKNEQGGMLCSQ 
MENLPSTALPAQMEDLTKTVLPLNIDRGSDPFLSLPAESSSIDL 
FPSPADSGTNSVFSQLENNTNHYSSQIEGNTNSSFLKGGNGENA 
VFPSQVNVANNFSSTNAQQSAPEKVKKDRGRGQTGKERKPKHNK 
RAKWPAI I RDGKFI CSRC YRAFTNPR SLGGHLS KRS YCKPLDGA 
EIAQELLQSNGQPSLLASMILSTNAVNLQQPQQSTFNPEACFKD 
PSFLQLLAENRSPAFLPNTFPRSGVTNFNTSVSQEGSE1 I IQAL 
E TAG I P S T FEGAE M LS HVS TG CVS DASQ VNATVM PN P TVPP LLH 
T VCH PNTLLTNQNRTSNS KTS S I EECS S L P VFPTNDL LLKT VEN 
GLCSSSFPNSGGPSQNFTSNSSRVSVISGPQNTRSSHLNKKGMS 
AS KRRKKVAPPLIAPNASQNLVTSDLTTMGLIAKSVE IPTTNLH 
SNVIPTCEPQSLVENLTQKLNNVNNQLFMTDVKENFKTSLESHT 
VLAPLTLKTENGDSQMMALNSCTTSVNSDLQISEDNVIQNFEKT 
LEIIKTAMNSQILEVKSGSQGAGETSQNAQINYNIQLPSVNTVQ 
NNKLPDSSP\FSSFISVMPTESNIPQSE\VSHKEDQIQEILEGL 
QKLKLENDLSTPASQCVLINTSVTLTPTPVKSTADITVIQPVSE 
M I N I Q FND KVNKP F V CQNQGCNYS AMT KDALFKHYG K I HQ YT P E 
M I L E I K KNQLKFAP F KC W P T CT KTFTRNSNLRAHCQL VHH FTT 
EEMVKLKIKRPYGRKSQSENVPASRSTQVKKQLAMTEENKKESQ 
PALELRAETQNTHSNVAVIPEKQLIEKKSPDKTESSLQVITVTS 
EQ CNTNALTNTQTKGR KI RRHKKE KEE KKRKKP VS QS L EFPTR Y 
SPYRPYRCVHQGCFAAFTIQQNLILHYQAVHKSDLPAFSAEVEE 
ESEAGKESEETETKQTLKEFRCQVSDCSRIFQAITGLIQHYMKL 
HEMTPEE I E S MTAS VDVGKFPCDQLECKS S FTTYLNYWHLEAD 
HG I GLRAS KTEEDG VYKCD CEG CDR I YATRS NLLRH I FNKHND K 
HKAHLIRPRRLTPGQENMSSKANQEKSKSKHRGTKHSRCGKEGI 
KMPKTKRKKKNNLENKNAKIVQIEENKPYSLKRGKHVYSIKARN 
DALSECTSRFVTQYPCMIKGCTSWTSESNIIRHYKCHKLSKAF 
TSQHRNLLIVFKRCCNSQVKETSEQEGAKNDVKDSDTCVSESND 
NSRTTATVSQKEVEKNE*DEMDELTELFITKLINEDSTSVETQA 
NTS SNVSND FQEDNLCQSERQKASNLKRVNKEICNVSQNKKRKVE 
KAEPASAAELSSVRKEEETAVAIQTIEEHPASFDWSSFKPMGFE 
VSFLKFLEESAVKQKKNTDKDHPNTGNKKGSHSNSRKNIDKTAV 
TSGNHVCPCKESETFVQFANPSQLQCSDNVKIVLDKNLKDCTEL 
VLKQLQEMKPTVSLKKLEVHSNDPDMSVMKDI S IGKATGRGQY 


5363 


8066


703 


RL C CTGGG EGT PGAS G KRGP AATTS LVLC IPSVPPPVPF PTLW P 
PPSWRRQPPGGIRRDFSRRLRREANLVATCLPVRASLPHRLNML 
RGPG PGLLLLAVLCLGTAVPS TGASKSKRQAQQMVQPQS PVAVS 
QS KPG CYDNG KH YQ I NQ QWERT YLGNAL VCTC YGGS RGFNCE S K 
P E AE E TC FD K YTGNT YR VGDT YERP KDS M I WDCTC I G AGRGR I S 
CTIANRCHEGGQSYKIGDTWRRPHETGGYMZjECVCLGNGKGEWT 
CKPIAEKCFDHAAGTSYWGETWEKPYQGWMMVDCTCLGEGSGR 
I TCTS RNRCNDQDTRTS YRIGDTWS KKDNRGNLLQC I CTGNGRG 
EWKCERHTSVQTTSSGSGPFTDVRAAVYQPQPHPQPPPYGHCVT 
DSGWYSVGMQLA*KTQGNKQML\CTCLGNGVSCQETAVTQTYG 
GNSNGEPCVLPFTYNGRTFYSCTTEGRQDGHLWCSTTSNYEQDQ 
KYS FCTDHTVLVQTRGGNSNGALCHFPFL YNNHNYTDCTS EGRR 
DNMKWCGTTQNYDADQKFGFCPMAAHEEICTTNEGVMYRIGDQW 
DKQHDMGHMMRCTCVGNGRGEWTCIAYSQLRDQCIVDDITYNVN 
DTFHKRHEEGHMLNCTCFGQGRGRWKCDPVDQCQDSETGTFYQI 
GDS W E KY VHG VR YQC YC YGRG IGEWHCQPLQTYPSS S GP VEV F I 
TETPSQPNSHPIQWNAPQPSHISKYILRWRPKNSVGRWKEATIP 
GHLNS YT I KGLKPG WYEGQL I S I QQ YGHQE VTRFDFTTTSTST 
PVTSNTWTGETTPFSPLVATSESVTEITASSFWSWVSASDTV 
SGFRVEYELSEEGDEPQYLVLPSTATSV\NIP\DLLPGRKYIVN 
VYQISEDGEQSLILSTSQTTAPDAPPDPTVDQVDDTSIWRWSR 
PQAP ITGYR I VYSPSVEGS STELNLPETANS VTLSDLQPGVQYN 
ITIYAVEENQESTPWIQQETTGTPRSDTVPSPRDLQFVEVTDV 
KVT I MWT P P E S AVTG YR VDV I P VNLPGEHGQRL PLS RNT F \ AEN 
TGLSPGVTYYFKVFAVSHGRESKPLTAQQTTKL\DAPTNLQFVN 
ETDSTVLVRWTPPRAQITGYRLTVGLTRRGQPRQYNVGPSVSKY 
PLRNLQPAS E YTVS LVA I KGNQE S P KATG VFTTLQPGS S I P P YN 
TEVTETTIVITWTPAPRIGFKLGVRPSQGGEAPREVTSDSGSIV 
VSGLTPGVEYVYTIQVLRDGQERDAP\IVNK\VVTPLSPPTNLH 
LE ANP DTG VLTVS W ERS TTPD I TG YR I TTT PTNGQQGNS LEEW 
HADQS SCTF \ DNLEVPGLE YNVS VYTVKDDKES VP I SDTII PAV 
PPPTDLRFTN/ILGPDTMRVTW\APPPSIDLTNFLVRYSPVKNE 
GRMLQSLS I FFLSDN\AVVLTNLLPGTEYVVSVSSVYEQHESTP 
\LRGRQKTGLDSP\TGIDFS\DITA\NSFT\VHW\IAPRA/TPI 
TGYRIR\HHPEHF\SGRPREDR\VPHSRNSITLTNLTPGTEYW 
S IVALNGREESPLLIGQQSTVSDVPRDLEWAATPTSLLI \SWD 
APAVTVRYYRITYGETGGNSPVQEFTVPGSKSTATISGLKPGVD 
YT I TVYAVTGRGDS PAS S KP IS INYRTE I DKPSQMQVTDVQDNS 
ISVKWLPSSSPVTGYRVTTT\PKNGPG\PTKTKTAGPDQTEMTI 
EGLQPTVE Y WS VYAQNPSGESQ PLVQTAVTNT DRPKGLAFTD V 
DVDSIKIAWESPCX3QVSRYRVTYSSPEDGIHELFPAPDGEEDTA 
ELQGLRPGSEYTVSWALHDDMESQPLIGTQSTAIPAPTDLKFT 
QVTPTSLSAQWTPPNVQLTGYRVRVTPKEKTGPMKEINLAPDSS 
SVWSGLMVATKYEVSVYALKDTLTSRPAQGWTTLENVSPPRR 
AR VTD ATETT I T IS WRT KTET I TGFQ VDAVP ANGQT PIQRTIKP 
DVRSYTITGLQPGTDYKIYLYTLNDNARSSPWIDASTAIDAPS 
NLRFLATTPNSLLVSWQPPRARITGYIIKYEKPGSPPREWPRP 
RPGVTEATITGLEPGTEYTIYVIALKNNQKSEPLIGRKKTDELP 
QLVTL PHPNLHG PE I LDVPSTVQKT P FVTHPGYDTGNG I QLPGT 
SGQQPSVGQQMIFEEHGFRRTTPPTTATPIRHRPRPYPPNVGQE 
ALSQTTISWAPFQDTSEYIISCHPVGTDEEPLQFRVPGTSTSAT 
LTGLTRGATYNI I VEALKDQQRHKVREEVVTVGNS VNEGLNQPT 
DDS CFDPYTVS H YAVGDE WERMS E SG FKLLCQCLGFGSGHFRCD 
SSRWCHDNGVNYKIGEKWDRQGENGQMMSCTCLGNGKGEFKCDP 
HEATCYDDGKTYHVGEQWQKEYLGAICSCTCFGGQRGWRCDNCR 
RPGGE P S P EGTTGQS YNQ YS QR YHQRTNTNVNC P I EC FM PLD VQ 
ADREDSRE 


5364


8066 


703 


RL CCTGGG EGTPGASGKRG P AATTS LVLC I PS VP P P VP FPTL WP 
P P SWRRQP PGG I RRDFSRRLRREANLVATCLPVRASLPHRLNML 
RG PGPGLLLLAVLCLGTAVPSTGASKS KRQAQQMVQPQS PVAVS 
QS KPGCYDNGKH YQ I NQQWER T YLGNALVCTC YGGSRG FNCES K " 

P E AE ETCF D K YTGNT YR VGDT YER P KDSM I WD CTC I G AGRG R I S 

CTIANRCHEGGQSYKIGDTWRRPHETGGYMLECVCLGNGKGEWT 

CKPIAEKCFDHAAGTSYWGETWEKPYQGWMMVDCTCLGEGSGR 

I TCTS RNRCNDQDTRTS YRIGDTWS KKDNRGNLLQC I CTGNGRG 

EWKCERHTSVQTTSSGSGPFTDVRAAVYQPQPHPQPPPYGHCVT 

DSGWYSVGMQLA*KTQGNKQML\CTCLGNGVSCQETAVTQTYG 

GNSNGEP CVL P FTYNGRTF YS CTTEGRQDGHLWCSTTSNYEQDQ 

KYSFCTDHTVLVQTRGGNSNGALCHFPFLYNNHNYTDCTSEGRR 

DNMKWCGTTQNYDADQ KFG FC PMAAHE E I CTTNEGVM YR IGDQW 

DKQHDMGHMMRCTCVGNGRGEWTCIAYSQLRDQCIVDDITYNVN 

DT FH KRH E E GHM LNCT C FGQGRGRWKCDP VDQCQDS E TGT F YQ I 

GDSWEKYVHGVRYQCYCYGRGIGEWHCQPLQTYPSSSGPVEVFI 

TETPSQPNSHPIQWNAPQPSHISKYILRWRPKNSVGRWKEATIP- 

GHLNSYTIKGLKPGWYEGQLISIQQYGHQEVTRFDFTTTSTST 

PVTSNT\VTGETTPFSPLVATSESVTEITASSFWSWVSASDTV 

SGFRVEYELSEEGDEPQYLVLPSTATSV\NIP\DLLPGRKYIVN 

VYQISEDGEQSLILSTSQTTAPDAPPDPTVDQVDDTSIVVRWSR 

P QAP I TG YR I V Y S PS VE G S S TELNL P ETANS VTL3 DLQ P G VQ YN 

ITIYAVEENQESTPWIQQETTGTPRSDTVPSPRDLQFVEVTDV 

KVTIMWTPPESAVTGYRVDVIPVNLPGEHGQRLPLSRNTF\AEN 

TGLSPGVTYYFKVFAVSHGRESKPLTAQQTTKL\DAPTNLQFVN 

ETDSTVLVRWTPPRAQITGYRLTVGLTRRGQPRQYNVGPSVSKY 

PLRNLQPASEYTVSLVAIKGNQESPKATGVFTTLQPGSS I PPYN 

TEVTETTIVITWTPAPRIGFKLGVRPSQGGEAPREVTSDSG3IV 

VSGLTPGVEYVYTIQVLRDGQERDAP\lVNK\WTPLSPPTNIjH 

L EAN P DTGVLT VS WERSTTP DI TG YR I TTTPTNGQQGNS LE E W 

HADQSSCTF\DNLEVPGLEYNVSVYTVKDDKESVPISDTIIPAV 

PPPTDLRFTN/ILGPDTMRVTW\APPPSIDLTNFLVRYSPVKNE 

GRMLQS LS I F FL S DN \ A WL TNL L PG TE YWS VS S VYEQHE S TP 

\LRGRQKTGLDSP\TGIDFS\DITA\NSFT\VHW\IAPRA/TPI 

TGYRIR\HHPEHF\SGRPREDR\VPHSRNSITLTNLTPGTEYW 

SIVALNGREESPLLIGQQSTVSDVPRDLEWAATPTSLLI\SWD 

APAVTVRY YR I T YGETGGNS PVQEFTVPGS KSTATI SGLKPG VD 

YT I TVYAVTGRGDS PAS S KP I S INYRTE I DKPS QMQVTDVQDNS 

ISVKWLPSSS PVTGYRVTTT\ PKNGPG \ PTKTKTAGPDQTEMTI 

EGLQPTVEYWSVYAQNPSGESQPLVQTAVTNIDRPKGIiAFTDV 

DVD S I K I AWE S PQGQ VS R YRVT YS S P EDG I HEL FP AP DG EEDTA 

ELQGLRPGSEYTVSWALHDDMESQPLIGTQSTAIPAPTDLKFT 

QVTPTSLSAQWTPPNVQLTGYRVRVTPKEKTGPMKEINLAPDSS 

S WVSGLMVATKYEVS VYAL KDTLTSRPAQGWTTLENVS P PRR 

ARVTDATETT IT I S WRTKTET I TGFQVDAVPANGQTP I QRT I KP 

DVRSYTITGLQPGTDYKIYLYTLNDNARSSPWIDASTAIDAPS 

NLRFLATTPNSLLVSWQPPRARITGYIIKYEKPGSPPREWPRP 

RPGVTEATITGLEPGTEYTIYVIALKNNQKSEPLIGRKKTDELP 

Q L VTL P H PNLHG P E I LD VPST VQKTP F VTH PGYDTGNGIQL PGT 

SGQQPSVGQQMIFEEHGFRRTTPPTTATPIRHRPRPYPPNVGQE 

ALSQTTI SWAP FQDTSEY 1 1 SCHPVGTDEEPLQFRVPGTSTSAT 

LTGLTRGATYN 1 1 VEALKDQQRHKVREEWTVGNSVNEGLNQPT 

DDSCFDPYTVSHYAVGDEWERMSESGFKLLCQCLGFGSGHFRCD 

S S R W CHDNG VNY K I GE KWDRQGENGQMMS CTCLGNG KG EFKCD P 

HEATCYDDGKTYHVGEQWQKEYLGAICSCTCFGGQRGWRCDNCR 

RPGGEPSPEGTTGQSYNQYSQRYHQRTNTNVNCPIECFMPLDVQ 

ADREDSRE


5365


8066


703 


RLCCTGGGEGTPGASGKRGPAATTSLVLCIPSVPPPVPFPTLWP 
PPSWRRQPPGGIRRDFSRRLRREANLVATCLPVRASLPHRLNML 
RGPG PGLLIiLAVLCLGTAVPSTGASKS KRQAQQMVQPQS PVAVS 
QS KPGCYDNGKH YQ INQQWERTYLGNALVCTCYGGS RG FNCE S K 
PEAEETCFDKYTGNTYRVGDTYERPKDSMIWDCTCIGAGRGRIS 
CTI ANR CHEGGQS YK IG DTWRRPHETGG YMLEC VCLGNG KG E WT 
CKPIAEKCFDHAAGTSYWGETWEKPYQGWMMVDCTCLGEGSGR 
I TCTSRNRCNDQDTRTS YRIGDTWS KKDNRGNLLQC I CTGNGRG 

EWKCERHTSVQTTSSGSGPFTDVRAAVYQPQPHPQPPPYGHCVT 

DSGWYSVGMQLA*KTQGNKQML\CTCLGNGVSCQETAVTQTYG 

GNSNGEPCVLPFTYNGRTFYSCTTEGRQDGHLWCSTTSNYEQDQ 

KYSFCTDHTVLVQTRGGNSNGALCHFPFLYNNHNYTDCTSEGRR 

DNMKWCGTTQNYDADQKFGFCPMAAHEEICTTNEGVMYRIGIDQW 

D KQHDMGHMMRCTCVGNGRGEWT C I AYSQLRDQC I VDD I T YNVN 

DTFHKRHEEGHMLiNCTCFGQGRGRWKCDPVDQCQDSETGTFYQI 

GDSWEKYVHGVRYQCYCYGRGIGEWHCQPLQTYPSSSGPVEVFI 

TETPSQPNSHPIQWNAPQPSHISKYILRWRPKNSVGRWKEATIP 

GHLNSYTIKGLKPGWYEGQLISIQQYGHQEVTRFDFTTTSTST 

PVTSNT\VTGETTPFSPLVATSESVTEITASSFVVSWVSASDTV 

SGFRVEYELSEEGDEPQYLVLPSTATSV\NXP\DLLPGRKYIVN 

VYQISEDGEQSLILSTSQTTAPDAPPDPTVDQVDDTSIWRWSR 

PQAPITGYRIVYSPSVEGSSTELNLPETANSVTLSDLQPGVQYN 

ITIYAVEENQESTPWIQQETTGTPRSDTVPSPRDLQFVEVTDV 

KVTIMWTPPESAVTGYRVDVIPVNLPGEHGQRLPLSRNTF\AEN 

TGLSPGVTYYFKVFAVSHGRESKPLTAQQTTKL\DAPTNLQFVN 

ETDSTVLVRWTPPRAQITGYRLTVGLTRRGQPRQYNVGPSVSKY 

PLRNLQPASEYTVSLVAIKGNQESPKATGVFTTLQPGSSIPPYN 

TEVTETTIVITWTPAPRIGFKLGVRPSQGGEAPREVTSDSGSIV 

VSGLT?GVEYVYTIQVLRDGQERDAP\IVNK\WTPLSPPTNLH 

LEANPDTGVLTVSWERSTTPDITGYRITTTPTNGQQGNSLEEW 

HADQSSCTF\DNLEVPGLEYNVSVYTVKDDKESVPISDTIIPAV 

PPPTDLRFTN/ ILGPDTMRVTW\APPPSIDLTNFLVRYSPVKNE 

GRMLQSLSIFFLSDN\AWLTNLLPGTEYWSVSSVYEQHESTP 

\LRGRQKTGLDSP\TGIDFS\DITA\NSFT\VHW\IAPRA/TPI 

TGYRIR\HHPEHF\SGRPREDR\VPHSRNSITLTNLTPGTEYW 

SIVALNGREESPLLIGQQSTVSDVPRDLEWAATPTSLLI\SWD 

APAVTVRYYRITYGETGGNSPVQEFTVPGSKSTATISGLKPGVD 

YT I TV YAVTGRGDS PAS S KP I S IN YRTE IDK PS QMQVTDVQDNS 

ISVKWLPSSSPVTGYRVTTT\PKNGPG\PTKTKTAGPDQTEMTI 

EGLQPTVEYWSVYAQNPSGESQPLVQTAVTNIDRPXGLAFTDV 

DVDSIKIAWESPQGQVSRYRVTYSSPEDGIHELFPAPDGEEDTA 

ELQGLRPGSEYTVSWALHDDMESQPLIGTQSTAIPAPTDLKFT 

QVTPTSLSAQWTPPNVQliTGYRVRVTPKEKTGPMKElNIiAPDSS 

S VWS G LMVAT K YE VS VYALKDT LTS R PAQGWTTL ENVS P PRR 

ARVTDATETTITISWRTKTETITGFQVDAVPANGQTPIQRTIKP 

DVRSYTITGLQPGTDYKIYLYTLNDNARSSPWIDASTAIDAPS 

NLRFLATTPNS LLVSWQP PRAR I TG Y 1 1 KYEKPGS P PRE WPRP 

RPGVTEATITGLEPGTEYTIYVIALKNNQKSEPLIGRKKTDELP 

QL VTL PHPNLHG PE I LD VPS TVQKTPFVTHPG YDTGNG I QLPGT 

SGQQ PS VGQQM I FE EHGFRRTTP PTTATP I RHRPRP YP PNVGQE 

ALSQTTISWAPFQDTSEYIISCHPVGTDEEPLQFRVPGTSTSAT 

LTGLTRGATYNIIVEALKDQQRHKVREEWTVGNSVNEGLNQPT 

DDSCFDPYTVSHYAVGDEWERMSESGFKIiLCQCLGFGSGHFRCD 

SSRWCHDNGVNYKIGEKWDRQGENGQMMSCTCLGNGKGEFKCDP 

HEATCYDDGKTYHVGEQWQKEYLGAICSCTCFGGQRGWRCDNCR 

RPGGEPSPEGTTGQSYNQYSQRYHQRTNTNVNCPIECFMPLDVQ 

ADREDSRE 


5366 


8066 


703 


RLCCTGGGEGTPGASGKRGPAATTSLVLCIPSVPPPVPFPTLWP 
PPSWRRQPPGGIRRDFSRRLRREANLVATCLPVRASLPHRLNML 
RGPGPGLLLLAVLCLGTAVPSTGASKSKRQAQQMVQPQSPVAVS 
QSKPGCYDNGKHYQINQQWERTYLGNALVCTCYGGSRGFNCESK 
PEAEETCFDKYTGNTYRVGDTYERPKDSMIWDCTCIGAGRGRIS 
CTIANRCHEGGQSYKIGDTWRRPHETGGYMLECVCLGNGKGEWT 
CKPIAEKCFDHAAGTSYWGETWEKPYQGWMMVDCTCLGEGSGR 
I T CTSRNR CNDQDTRTS YR I GDT WS KKDNRGNLLQC I CTGNGRG 
EWKCERHTSVQTTSSGSGPFTDVRAAVYQPQPHPQPPPYGHCVT 
DSGWYSVGMQLA* KTQGNKQML\ CTCLGNGVSCQETAVTQTYG 
GNSNGE P CVLP FT YNGR TF YS CTTEGRQDGHLWCSTTS W YEQDQ 
KYS FCTDHT VLVQTRGGNSNGALCHFPFLYNNHN YTDCTSEGRR " 

DNMKWCGTTQNYDADQKFGFCPMAAHEEICTTNEGVMYRIGDQW 

DKQHDMGHMMRCTCVGNGRGEWTCIAYSQLRDQCIVDDITYNVN 

DTFHKRHEEGHMLNCTCFGQGRGRWKCDPVDQCQDSETGTFYQI 

GDSWEKYVHGVRYQCYCYGRGIGEWHCQPLQTYPSSSGPVEVFI 

TETPSQPNSHPIQWNAPQPSHISKYILRWRPKNSVGRWKEATIP 

GHLNSYTIKGLKPGVVYEGQLISIQQYGHQEVTRFDFTTTSTST 

PVTSNT\VTGETTPFSPLVATSESVTEITASSFWSWVSASDTV 

SGFRVEYELSEEGDEPQYLVLPSTATSV\NIP\DLLPGRKYIVN 

VYQISEDGEQSLILSTSQTTAPDAPPDPTVDQVDDTSIVVRWSR 

PQ A P I TG YR I V YS PS VEG S S TE LNL P ETANS VTLS D LQ PGVQ YN 

ITIYAVEENQESTPWIQQBTTGTPRSDTVPSPRDLQFVEVTDV 

KVTIMWTPPESAVTGYRVDVIPVNLPGEHGQRLPLSRNTF\AEN 

TGLSPGVTYYFKVFAVSHGRESKPLTAQQTTKL\DAPTNLQFVN 

ETDSTVLVRWTPPRAQITGYRLTVGLTRRGQPRQYNVGPSVSKY 

P LRNLQ PAS E YT VSLVAI KGNQES PKATGVFTTLQPGS SIP PYN 

TEVTETTIVITWTPAPRIGFKLGVRPSQGGEAPREVTSDSGSIV 

VSGLTPGVEYVYTIQVLRDGQERDAP\IVNK\WTPLSPPTNLH 

LEANPDTGVLTVSWERSTTPDITGYRITTTPTNGQQGNSLEEW 

HADQSSCTF\DNLEVPGLEYNVSVYTVKDDKES VPISDTI I PAV 

PPPTDLRFTN/lLGPDTMRVTW\APPPSIDLTNFIiVRYSPVKNE 

GRMLQSLSIFFLSDN\AWLTNLLPGTEYWSVSSVYEQHESTP 

\LRGRQKTGLDSP\TGIDFS\DITA\NSFT\VHW\IAPRA/TPI 

TGYRIR\HHPEHF\SGRPREDR\VPHSRNSITLTNLTPGTEYW 

SIVALNGREESPLLIGQQSTVSDVPRDLEWAATPTSLLI\SWD 

APAVT VR YYR I T YGETGGNS P VQE FTVPGS KS TAT I SGLKPG VD 

YTI TVYAVTGRGDS PASSKP I S INYRTE I DKPSQMQVTDVQDNS 

ISVKWL?SSSPVTGYRVTTT\PKNGPG\PTKTKTAGPDQTEMTI 

EGLQPTVEYWSVYAQNPSGESQPLVQTAVTNIDRPKGIAFTDV 

DVDSIKIAWESPQGQVSRYRVTYSSPEDGIHELFPAPDGEEDTA 

ELQGLRPGSEYTVSVVALHDDMESQPLIGTQSTAIPAPTDLKFT 

QVTPTSLSAQWTPPNVQLTGYRVRVTPKEKTGPMKEINLAPDSS 

SWVSGLMVATKYEVSVYALKDTLTSRPAQGWTTLENVSPPRR 

ARVTDATETTITISWRTKTETITGFQVDAVPANGQTPIQRTIKP 

DVRSYTITGLQPGTDYKIYLYTLNDNARSSPWIDASTAIDAPS 

NLRFLATTPNSLLVSWQPPRARITGYIIKYEKPGSPPREWPRP 

RPG VTEAT I TGLE PGTEYT I YVIALKNNQKSE P L I GRKKTDELP 

Q L VTL P HPNLHG P E I LDVP S T VQKT P F VTHPG YDTGNG I QLPGT 

SGQQPSVGQQMIFEEHGFRRTTPPTTATPIRHRPRPYPPNVGQE 

ALS QTT I S WAP FQDTS E YI I S CHP VG TDEEPLQ FR VPGTS TS AT 

LTGLTRGAT YNI I VEALKDQQRHKVREE WTVGNS VNEGLNQPT 

DDSCFDPYTVSHYAVGDEWERMSESGFKLLCQCLGFGSGHFRCD 

S SRWCHDNGVN Y KI GE KWDRQGENGQMMSCTCLGNGKGE FKCDP 

HEATCYDDGKTYHVGEQWQKEYLGAICSCTCFGGQRGWRCDNCR 

RPGGEPSPEGTTGQSYNQYSQRYHQRTNTNVNCPIECFMPLDVQ 

ADREDSRE 


5367 


235


3591 


KKILNMLCKKNI VI E YLAD I LYE YLYGFCFSGIKKYL I IHVLRL 
ILELWMTRLLLEKSVSLQTQYLLlilVKILSWFPGKEMRHHLQIM 
EVMMRKQDS/RIVGNGSEQQLQKELADVLMDPPMDDQPGEKELV 
KRSQLDGEGDGPLSNQLSASSTINPVPLVGLOKPEMSLPVKPGQ 
GDSEAS S PFTP VADEDS WFS KLT YLGCAS VNAPRS E VEALRMM 
SILRSQCQISLDVTLSVPNVSEGIVRLLDPQTNTEIANYPIYKI 
LFCVRGHDGTPESDCFAFTESHYNAELFRIHVFRCEIQEAVSRI 
LYS FATAFRRSAKQTPLS ATAAPQTPDS D I FTFSVS LE I KEDDG 
KGYFSAVPKDKDRQCFKLRQGlDKKIVIYVQQTTNKEIiAIERCF 
GLLLSPGKDVRNSDMHLLDLESMGKSSDGKSYVITGSWNPKSPH 
FQ WNE E T P KDKVL FMTTAVDLV I TE VQ E P VR FLL ET KVR VCS P 
NERLFWPFSKRSTTEWFFLKLKQIKQRERKNNTDTLYEWCLES 
ESERERRKTTAS PS VRLPQSGSQSSVI PS PPEDDEEEDNDEPLL 
SG S G D VS KE CAEK I LE TWGELLS KWHLNLN VR PKQLS S L VRNG V 
PEALRGEVWQLLAGCHNNDHLVEKYRILITKESPQDSAITRDIN 
RT FPAHD Y FKDTGGDGQDSLY KI CKAYS VY DEE IGYCQGQSFLA 
AVLLLHMPEEQAFSVLVKIMFDYGLRELFKQNFEDLHCKFYQLE 
RLMQEYIPDLYNHFLDISliEAHMYASQWFLTLFTAKFPLYMVFH 
IIDLLLCEGISVIFNVALGLLKTSKDDLLLTDFEGALKFFRVQL 
PKRYRSEENAKKLMELACNMK1SQKKLKKYEKEYHTMREQQAQQ 
EDP I E RFE RENRRLQEANMRL EQENDDLAHEL VTS K I ALR KDLD 
NAEEKADALNKELLMTKQKLIDAEEEKRRLEEESAHLKKMCRRE 
LDKAESEIKKNrSSIIGDYKQICSQLSERLEKQQTANKVEIEKIR 
QKVDDCERCREFFNKEGRVKG1SSTKEVLDEDTDEEKETLKNQL 
REMELELAQTKL\QLVEAECKIQD\LEHPF*GLPFNE\VQAA\K 
KTWFNRTLS S I KTATGVQGKETC 


5368 


573 


2014 


GAAAGAADPRRGS LGGRTMLDFAI FAVTFLLALVGAVL YLY PAS 
RO AAG I PG I T P TE E KDGNL PD I VKTS G SL.HE PL VKTLP P P VR PW c 

FWFGRRLWSLGTVDVLKQHINPNKTLD/LF*NTHAEVIIKVSIW 
WWQCE*KP\QRKKLYENGVTDSLKSNFALLLKLPEELLDKWLSY 
PETQH\VPLSQHMLGFAMKSVTQMVMGSTFEDDQEVIRFQKNHG 
TVWSEIGKGFLDGSLDKNMTRKKQYEDALMQLESVLRNIIKERK 
GRNFSQHIFIDSLVQGNLNDQQILEDSMIFSLASCIITAKLCTW 
AIWFLTTSEEVQKKLYEEINQVFGNGPVTPEKIEQLRYCQHVLC 
ETVRTAKLTPVSAQLQDI EGKIDRFI IPRETLVLYALGWLQDP 
NTWPSPHKFDPDRFTIDELVMKTFSSLGFSGTQECPELRFAYMVT 
TVLLSVLVKRLHLLSVEGQVIETKYELVTSSREEAWITVSKRY 


5369 


1 ~i 


6622 


PRSLCFSLWAEAAVLADGGLRRRRRLLRGTMSASFVPNGASLED 
CHCNLFCLADLTGIKWKKYVWQGPTSAPILFPVTEEDPILSSFS 
RCLKADVLG/ VWRRDQRPERRE \L* I FWGGEDP \VLLTLFTMTY 
QKKKMECGRMDFPMNAVLCFSKAVHNLLERCLMNRNFVRIGKWF 
VKPYEKDEKPINKSEHLSCSFTFFLHGDSNVCTSVEINQHQPVY 
LLSEEHI TLAQQSNS P FQVI LCP FGLNGTLTGQAFKMSDS ATKK 
LIGEWKQFYP I S CCLKEMSEE KQEDMD WEDD S LAAVE VLVAGVR 
MIYPACFVLVPQSDIPTPSPVGSTHCSSSCLGVHQVPASTRDPA 
MSSVTLTPPTSPEEVQTVDPQSVQKWVKFSSVSDGFNSDSTSHH 
GGKI P RKLANHVVDRVWQECNMNRAQNKRKYSAS SGGLCEEATA 
AKVASWDFVEATQRTNCSCLRHKNLKSRNAGQQGQAPSLGQQQQ 
ILPKHKTNEKQEKSEKPQKRPLTPFHHRVSVSDDVGMD\ADS\A 
SQRLV\ I S AP \ DSQ\ VR FSNI R\TJNTDVAK\ TPQMHGTEMANS PQ 
PPPLSP\HPCDWDEGVTKTPSTPQSQHFYQMPTPDPLVPSKPM 
EDRIDSLSQSFPPQYQEAVEPTVYVGTAVNLEEDEANIAWKYYK 
FPKKKDVEFLPPQLPSDKFKDDPVGPFGQESVTSVTELMVQCKK 
PLKVSDELVQQYQIKNQCLSAIASDAEQEPKIDPYAFVEGDEEF 
LFPDKKDRQNSEREAGKKHKVEDGTSSVTVLSHEEDAMSLFSPS 
IKQDAPRPTSHARPPSTSLIYDSDLAVSYTDLDNLFNSDEDELT 
PGSKRSANGSDDKASCKESKTGNLDPLSCISTADLHKMYPTPPS 
LEQH I M G FS PMNMNN KE YGS MDTTPGGT VLEGNS SSI GAQ FKI E 
VDEGFCSPKPSEIKDFSYVYKPENCQILVGCSMFAPLKTLPSQY 
LPLIKLPEECIYRQSWTVGKLELLSSGPSMPPIKEGDGSNMDQE 
YGTAYTPQTHTSCGMPPSSAPPSNSGAGILPSPSTPRFPTPRTP 
RTPRTPRGAGGPASAQGSVKYENSDLYSPASTPSTCRPLNSVEP 
ATVPS I PEAHSLYVNLILSES VMNLFKDCNSDSCCI CVCNMN I K 
GADVGVY I PDPTQEAQYRCTCGFSAVMNRKFGNNSGLFFEDELD 
1 1 GRNTDCGKEAE KR FEALRATS AEHVNGGLKE S EKLSDDL ILL 
LQDQCTNLFSPFGAADQDPFPKSGVISNWVRVEERDCCNDCYLA 
LEHGRQ FMDNMS GG KVDEAL V KS S CLH P WS KRND VS MQCSQD I L 
RML.LS LQ P VLQDA I Q KKRT VR P WGVQG P LTWQ Q FH KMAGRG S YG 
TDESPEPLPIPTFLLGYDYDYLVLSPFALPYWERLMLEPYGSQR 
DIAYWLCPENEALLNGAKSFFRDLTAIYESCRLGQHRPVSRLL 
TDG IMR VGSTAS KKLS EKLVAEWFSQAADGNNEAFSKLKLYAQV 
CRYDLGPYLASLPLDSSLLSQPNLVAPTSQSLITPPQMTNTGNA 
NTPSATLASAASSTMTVTSGVAISTSVATANSTLTTASTSSSSS 
SNLNSGVSSNKLPSFPPFGSMNSNAAGSMSTQANTVQS3QLGGQ 
QTSALQTAGISGESSSLPTQPHPDVSESTMDRDKVGIPTDGDSH 
AVTYPPAIWYIIDPFTYENTDESTNS SS VWTLGLLRCFLEMVQ 
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine,
R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
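The symbol key above can be applied mechanically when reading the segments in this table. The following is a minimal sketch, not part of the original disclosure, that tallies residues, stop codons, and possible deletion or insertion marks in one segment. The function name `summarize_segment` and the returned dictionary keys are hypothetical, and whitespace is assumed to be a layout artifact and is ignored.

```python
from collections import Counter

# One-letter residue codes as listed in the table legend (X = Unknown).
RESIDUES = set("ACDEFGHIKLMNPQRSTVWYX")
STOP = "*"        # stop codon
DELETION = "/"    # possible nucleotide deletion
INSERTION = "\\"  # possible nucleotide insertion


def summarize_segment(segment: str) -> dict:
    """Tally the legend symbols in one amino acid segment.

    Whitespace is skipped. Any character that is not in the legend is
    collected in 'unrecognized' so it can be inspected by hand.
    """
    counts = Counter()
    unrecognized = []
    for ch in segment:
        if ch.isspace():
            continue
        if ch in RESIDUES:
            counts["residues"] += 1
        elif ch == STOP:
            counts["stops"] += 1
        elif ch == DELETION:
            counts["possible_deletions"] += 1
        elif ch == INSERTION:
            counts["possible_insertions"] += 1
        else:
            unrecognized.append(ch)
    return {
        "residues": counts["residues"],
        "stops": counts["stops"],
        "possible_deletions": counts["possible_deletions"],
        "possible_insertions": counts["possible_insertions"],
        "unrecognized": unrecognized,
    }


if __name__ == "__main__":
    # A short made-up segment, used only to exercise the function.
    print(summarize_segment("AC D\\E*F/G"))
```

Characters reported under `unrecognized` (digits, lowercase letters, punctuation) are most likely transcription damage rather than legend symbols, so they are reported rather than silently corrected.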








TLPPHIKSTVSVQIIPCQYLLQPVKHEDREIYPQHLKSLAFSAF
TQCRRPLPTSTNVKTLTGFGPGLAMETALRSPDRPECIRLYAPP
r J. Lu\c v ruJivvJ- &u\3Ei irbiuioUMuvbr VuIv^ijorlL'wKWl J_iA£at. 
TDLYGELLETCIINIDVPNRARRKKSSARKFGIiQKLWEWCLGLV 
QM S SL PWR WI GRLGR I GHGELKDWS CLLS RRNLQSLS KRLKDM 
CRMCGISAADSPSILSACLVAMEPQGSFVIMPDSVSTGSVFGRS 
TTLNMQTSQLNTPQDTSCTHILVFPTSASVQVASATYTTENLDL 
AFNPNNDGADGMGIFDLLDTGDDLDPDIINILPASPTGSPVHSP 
GSHYPHGGDAGKGQSTDRLLSTEPHEEVPNILQQPLALGYFVST 
AKAQj^biHUWr WbACrOAQiQCPLr LKASLHLHVPSVQSDELLHS 
KHSHPLDSNQTSDVLRFVLEQYNALSWLTCDPATQDRRSCLPIH 
FWLNQLYNFIMNML 


5370 


1226 


716 


RWSRKLELRRAAQATESRPPQSQEMHPPTGKEVHALKRLRDSAN 
ANDVETVQQLLEDGADPCAADDKGRTALHFASCNGNDQIVQLLL
DHGADPNQRDGLGNTPLHLAACTNHVPVITTLLRGGARVDALDR 
AGRTPLHLAKSKLNILQEGHAQCLKAVR/HGGEADHPYAEGVSG 
APRAT*AARCSGVFPSPSRWLGSAPWSRSSCTIWSLPLHEAKCR 
AVRPLSSAAQGSAPSSSSCCTVSTSIiALAESLSLFRACTSLPVG 


5371 


1331 


167 


IAAMLWKLLLRSQSCRLCSFRKMRSPPKYRPFLACFTYTTDKQS 
SKENTRTVEKLYKCSVDIRKIRR\*KDGYF*RMKPMLKKLRI/F
LQELGADETAVASILERCPEAIVCSPTAVNTQRKLWQLVCKNEE
ELIKLIEQFPESFFTIKDQENQKLNVQFFQELGLKNWISRLLT
AAPNVFHNPVEKNKQMVRILQESYLDVGGSEANMKVWLLKLLSQ
NPFILLNSPTAIKETLEFLQEQGFTSFEILQLLSKLKGFLFQLC
PRSIQNSISFSKNAFKCTDHDLKQLVLKCPALLYYSVPVLEERM
QGLLREGISIAQIRETPMVLELTPQIVQYRIRKLNSSGYRIKDG 
HLANLNGSKKEFEANFGKIQAKKVRPLFNPVAPLNVEE 


5372 


51 


857 


SPGAQFLWAAPDMPDPLFSAVQGKDEILHKALCFCPWLGKGGME 
PLRLLILLFVTELSGAHNTTVFQGVAGQSLQVSCPYDSMKHWGR
RKAWCRQLGEKGPCQRWSTHNLWLLSFLRRWNGSTAITDDTLG
GTLTITLRNLQPHDAGLYQCQSLHGSEADTLRKVLVEVLADPLD
HRDAGDLWFPG\DLRASRMPMWSTASPGASWKEKSPSHPLPSFS
SWPASFSSRF*QPAPSGLQPGMDRSQGHIHPVNWTVAMTQGISS
KLCQG 


5373 


2814 


346 


VKKTKSIFNSAMQEMEVYVENIRRKFGVFNYSPFRTPYTPNSQY 
QMLLDPTNPSAGTAKIDKQEKVKLNFDMTASPKILMSKPVLSGG
TGRRISLSDMPRSPMSTNSSVHTGSDVEQDAEKKATSSHFSASE
ESMDFLDKSTASPASTKTGQAGSLSGSPKPFSPQLSAPITTKTD
KTSTTGSILNLNLDRSKAEMDLKELSESVQQQSTPVPLISPKRQ 
IRSRFQLNLDKTIESCKAQLGINEISEDVYTAVEHSDSEDSEKS 
DSSDSEYISDDEQKS*GTSQEDTEDKEGCQMDKEPSAVKKKPKP 
TNPVEIKEELKSTSPASEKADPGAVKDKASPEPEKDFSGKAKPS 
PHPIKDKLKGKDETDSPTVHLGLDSDSE\NELVIDLGEDHSGRE 
UKJUMftJ^rJSJirfc^KUL/VVvlvlrro i 1 VoonorJ?i»AirVLTRSSAQ 
TSAAGATATTSTSSTVTVTAPAPAATGSPVKKQRPLLPKE\TAP
AVQRSCGTSSTVQQKEITQSPSTSTITLVTSTQSSPLVTSSGSM 
STLVSSVNGDLPIGTASADVAADIAKYTSKL\MDAIKGTM\TEI 
YNDLSKN\TTWKAQLAEDSQGLRIEIEKLQWLHQQEL\SEMKHN 
LELTMAEMRQSWEQERDRLIAEVKKQLELEKQQAVDETKKKQWC 
ANFKKEAIFYCCWNTSYCDYPCQ\QAHWPEH\MKSCTQSATAPQ 

EKSKESGSTLDLSGSRETPSSILLGSNQGSDHSR\SNKSSWSSS 
DEKRGS\TRSDHN/TPSTQHGRSLLPGKESRAGTPFLGTSK 


5374 


2814 


346 


VKKTKSIFNSAMQEMEVYVENIRRKFGVFNYSPFRTPYTPNSQY 
QMLLDPTNPSAGTAKIDKQEKVKLNFDMTASPKILMSKPVLSGG 
TGRRISLSDMPRSPMSTNSSVHTGSDVEQDAEKKATSSHFSASE
ESMDFLDKSTASPASTKTGQAGSLSGSPKPFSPQLSAPITTKTD
KTSTTGSILNLNLDRSKAEMDLKELSESVQQQSTPVPLISPKRQ 
IRSRFQLNLDKTXESCKAQLGINEISEDVYTAVEHSDSEDSEKS 
DSSDSEYISDDEQKS*GTSQEDTEDKEGCQMDKEPSAVKKKPKP 
TNPVEIKEELKSTSPASEKADPGAVKDKASPEPEKkFSGKAKPS 
PHPIKDKLKGKDETDSPTVHLGLDSDSE\NELVIDLGEDHSGRE 
GRKNKKEPKEPSPKQDWGKTPPSTTVGSHSPPETPVLTRSSAQ 
TSAAGATATTSTSSTVTVTAPAPAATGSPVKKQRPLLPKE\TAP
AVQRSCGTSSTVQQKEITQSPSTSTITLVTSTQSSPLVTSSGSM 
STLVSSVNGDLPIGTASADVAADIAKYTSKL\MDAIKGTM\TEI 
YNDLSKN\TTWKAQLAEDSQGLRIEIEKLQWLHQQEL\SEMKHN 
LELTMAEMRQSWEQERDRLIAEVKKQLELEKQQAVDETKKKQWC 
ANFKKEAIFYCCWNTSYCDYPCQ\QAHWPEH\MKSCTQSATAPQ 
\QEADAE\VNTETLNKSSQGSSSSTQSAPSETASA\SKEKETSA 
EKSKESGSTLDLSGSRETPSSIIjLGSNQGSDHSR\SNKSSWSSS 
DEKRGS\TRSDHN/TPSTQHGRSLLPGKESRAGTPFLGTSK 


5375 


2907 


1116 


HIFLAEEEPMLERRCRGPLAMGPAQPRLLSGPSQESPQTLGKES 
RGLRQQGTSVA\QSGAQAPGRAHRCAHCRRHFPGWVA\LWLHTR 
RCQA/RGLPLPCPECGRRFRHAPFLALHRQVHAAATPDWGFACH 
LCGQSFRGWVALVLHLRAHSAAKAGPFACPKMARDAFWRRKAAS
SSILRRCHPSRPRGPRPFICGNCGRSILPTWDQ/LKVAHKRVHV
SRRP*ERGPPAKVFWGPRPRGPPTGDTPPGPGGDAVDRPF\QCA 
CCGKRFRHK\PNLIRSHAACTSGERPHQ/CSRECG\KRFTNKPY 
LTS\HRRITHTARQPYPCKECGRRFRHKPNLLSHSKIHKRSEGS 
AQAAPGPGSPQLPAGPQESAAEPTPAVPLKPAQEPPPGAPPEHP 
QDPIEAPPSLYSCDDCGRSFRLERFLRAHQRQHTGERPFTCAEC 
GKNFGKKTHLVAHSRVHSGERPFRLARKCGRRFLPRASQSGGRN 
SAEPNAPRFGPFVCPDCGKAFRHKPYLAAHRPIATPAEKPYVCP 
DCRKAFSQKSNL\VSHRRIHTGERPYACPDCDRSFSQKSNLITH 
RKSHIRDGAFCCAICGQTFDDEERLLAHQKKHDV 


5376 


4504 


591 


VSTFSLCLWPAGGGGRGRVSNMAQSKRHVYSRTPSGSRMSAEAS 
ARPLRVGSRVEVIGKGHRGTVAYVGATLFATGKWVGVILDEAKG 
KNDGTVQGRKYFTCDEGHGIFVRQSQIQVFEDGADTTSPETPDS
SASKVLKREGTDTTAKTSKLRGLKPKKAPTARKTTTRRPKPTRP
ASTGVAGASSSLGPSGSASAGELSSSEPSTPAQTPLAAPIIPTP 
VLTSPGAVPPLiPSPSKEEEGLRAQVRDLEEKLETLRLKRAEDKA 
KLKELEKHKIQLEQVQEWKSKMQEQQADLQRRLKEARKEAKEAL 
EAKERYMEEMADTADAIEMATLDKEMAEERAESLQQEVEALKER 
VDELTTDLE I LKAE IEEKGSDGAASS YQLKQLEEQNARIiKDALV 
RMRDLSSSEKQEHVK\LQKLMEKKNQEtiEWRQQRERLQEELSQ 
AESTIDELKEQVDAALGAEEMVEMLTDRNIiNLEEKVRELRETVG 
DLEAMNEMNDELQENARETELELREQLDMAGARVREAQKRVEAA 
QETVADYQQTIKKYRQLTAHLQDVNRELTNQQEASVERQQQPPP 
ETFDFKI KFAETKAHAKAIEMELRQMEVAQANRHMSLLTAFMPD 
SFLRPGGDHDCVLVLLLMPRLICKAELIRKQAQEKFELSENCSE 
RPGLRGAAGEQLSFAAIGLVY\SLMPAAGHRYHRY*CHALSQCR 
LD\VYKKVGSLYPEMSAHERSLDFLIELLHKDQLDETVNVEPLT 
KAIKYYQHLYS IHLAEQPEDCTMQLADHI KFTQSALDCWS VEVG 
RLRAFLQGGQEATD I ALLLRDLETSCS \DIRQFCKKIRRRMPGT 
DAPGI PAALAFGPQVSDTLLDCRKHLTWWAVLQEVAAAAAQL.I 
APLAENEGLLVAALEELAFKASEQIYGTPSSSPYECLRQSCNTL 
ISTMNK\LVTAMQEGEYDAERPPSKPPP\VELRAAALRAEITDA 
EGIjG L KLE DRETV I KE LKKS L KI KGEE LS EANVRLTLLE KKLDS 

AAKnADRHTRKVOT^TiPPTOAT.TiRXKPK'FPPPTMnJXT^nAnTnnT. 

n/A fV l-J/-\±J lit rv J- Lii\ V \£ k IV J J Hhi lit J ynuJJIvA rVIZj IV Hi C CjCi 1 I *U.r\xJ\^r\LJ JL LJ\JLj 

EAEKAELKQRLNSQSKRTIEGLRGPPPSGIATLVSGIAGEEQQR 
GAIPGQAPGSVPGPGLVKDSPLLLQQISAMRLHISQLQHENSIL
KGAQMKASLASLPPLHVAKLSHEGPGSELPAGALYRKTSQLLET
LNQLSTHTHWDITRTSPAAKSPSAQLMEQVAQLKSLSDTVEKL
KDEVLKETVSQRPGATVPTDFATFPSSAFLRAKEEQQDDTVYMG 
KVTFSCAAGFGQRHRLVLTQEQIjHQLHSRLIS 


5377 


762 


1106 


DVPCKRVLPAEAQEKGQLTLSCGESGEEG\F*YHEVRQAEGES* 
/WFGPNVRLVHTQLKTKKPSGTLKAKFYLHTGSTKFAARISCTK 
SS*WPGYDGWWGGQYIFIFRGMRWEEQP


5378


2009 


664 


QASGTTLRPLPDLPQLKRREATSRNRALKPRGRLVLMTSCLPAL
RFIATPRLSAMPHIDNDVKLDFKDVLLRPKRSTLKSRSEVDLTR
SFSFRNSKQTYSGVPIIAANMDTVGTFEMAKVLCKS*VPGSFWD 
VPQMGCVFLIYKLFTLKWKMLLLSVLLPASILVAEKFSLFTAVH 
KHYSLVQWQEFAGQNPDCLEHLAASSGTGSSDFEQLEQILEAIP 
QVKYICLDVANGYSEHFVEFVKDVRKRFPQHTIMAGtTVVTGEMV 
EEL X LSGAD 1 1 KVG 1 G PGSVCTTRKKTGVG Y PQLSAVMECADAA 
HGLKGHIISDGGCSCPGDVAKAFGAGADFVMLGGMLAGHSESGG 
ELIERDGKKYKLFYGMSS*I\AM\KKYAGGVAEYRASEGKTVEV
PFKGDVEHTIRDILGGIRSTCTYVGAAKLKELSRRTTFIRVTQQ
VNPIFSEAC 


5379


2009 


664 


QASGTTLRPLPDLPQLKRREATSRNRALKPRGRLVLMTSCLPAL 
RFIATPRLSAMPHIDNDVFCLDFKDVLLRPKRSTLKSRSEVDLTR 
SFSFRNSKQTYSGVPIIAANMDTVGTFEMAKVLCKS*VPGSFWD
VPQMGCVFLIYKLFTLKWKMLLLSVLLPASILVAEKFSLFTAVH 
KHYSLVQWQEFAGQNPDCLEHLAASSGTGSSDFEQLEQILEAIP 
Q VK Y I CLD VANG YS EH F VEF VKD VR KR FPQHTI MAGNWTGEMV 
EELILSGADIIKVGIGPGSVCTTRKKTGVGYPQLSAVMECADAA 
HGLKGHIISDGGCSCPGDVAKAFGAGADFVMLGGMLAGHSESGG 
ELIERDGKKYKLFYGMSS*I\AM\KKYAGGVAEYRASEGKTVEV
PFKGDVEHTIRDILGGIRSTCTYVGAAKLKELSRRTTFIRVTQQ 
VNPIFSEAC 


5380 


2 


2050 


PSRAGGAERGRAAAARSPGGSAAGWECPSVLDEAGACTMSSCVS
SQPSSNRAAPQDELGGRGSSSSESQKPCEALRGLSSLSIHLGME 
SFIWTECEPGCAVDLGLARDRPLEADGQEVPLDTSGSQARPHL 
SGRKLSLQERSQGGLAAGGSLDMNGRCICPSLPYSPVSSPQSSP 
RLPRRPTVESHHVSITGMQDCVQLNQYTLKDEIGKGSYGVVKLA 
YNENDNTYYAMKVLSKKKLIRQAAFPRRPPPRGTRPAPGGCIQP 
RGPI\EQVYQEIA\ILKKLDHPNW\KLVEVL\DDPNEDHLYMV 
F\ELVNQGPVMEVPTLKPLSEDQARFYFQDLIKGIEYLHYQKII 

H\RDIKPSNLLVGEDGHIKIADFGVSNEFKGSDALLSNTVGTPA
FMAPESLSETRKIFSGKALDVWAMGVTLYCFVFG*CPFMDERIM

CLHSKIKSQALEFPDQPDIAEDLKDLITRMLDKNPESRIWPEI
KLHPWVTRHGAEPLPSEDENCTLVEVTEEEVENSVKKIPSLATV
ILVKTMIRKRSFGNPFEGSRREERSLSAPGNLLTKKPTRECESL 
SELKT*KISPLPACCKVT*EFPHPSGCRPSCWQPPFLHTHSQPR 
*PEPPRTDEALCPYETGRTCWAPLLQVLWWVGTPLPFPLSTSWL 
PDLVGAPGSHFCFLNIALLRYNSHTM


5381 


2 


2050 


PSRAGGAERGRAAAARSPGGSAAGWECPSVLDEAGACTMSSCVS 
SQPSSNRAAPQDELGGRGSSSSESQKPCEALRGLSSLSIHLGME 
SFIWTECEPGCAVDLGLARDRPLEADGQEVPLDTSGSQARPHL 

SGRKLSLQERSQGGLAAGGSLDMNGRCICPSLPYSPVSSPQSSP
RLPRRPTVESHHVSITGMQDCVQLNQYTLKDEIGKGSYGWKLA
YNENDNTYYAMKVLSKKKLIRQAAFPRRPPPRGTRPAPGGCIQP
RGPI\EQVYQEIA\ILKKLDHPNW\KLVEVL\DDPNEDHLYMV
F\ELVNQGPVMEVPTLKPLSEDQARFYFQDLIKGIEYLHYQKII
H\RDIKPSNLLVGEDGHIKIADFGVSNEFKGSDALLSNTVGTPA
FMAPESLSETRKIFSGKALDVWAMGVTLYCFVFG*CPFMDERIM
CLHSKIKSQALEFPDQPDIAEDLKDLITRMLDKNPESRIVVPEI
KLHPWVTRHGAEPLPSEDENCTLVEVTEEEVENSVKHIPSLATV
ILVKTMIRKRSFGNPFEGSRREERSLSAPGNLLTKKPTRECESL
SELKT*KISPLPACCKVT*EFPHPSGCRPSCWQPPFLHTHSQPR
*PEPPRTDEALCPYETGRTCWAPLLQVLWWVGTPLPFPLSTSWL
PDLVGAPGSHFCFLNIALLRYNSHTM


5382 


1536


203 


GARGSQQDAPALQEAEVRGPERAQPARGRMTKARLFRLWLVLGS 
VFMILLIIVYWDSAGAAHFYLHTSFSRPHTGPPLPTPGPDRDRE
LTADSDVDEFLDKFLSAGVKQSDLPRKETEQPPAPGSMEESVRG
YDWSPRDARRSPDQGRQQAERRSVLRGFCANSSLAFPTKERPFD
DIPNSELSHLIVDDRHGAIYCYVPKVACTNWKRVMIVLSGSLLH
RGAPYRDPLRIPREHVHNASAHLTFNKFWRRYGKLSRHLMKVKL
KKYTKFLFVRDPFVRLISAFRSKFELENEEF/*PQVRRAHAAAV
RQPHQPARLGARGLPRWPQ\VSFANFIQYLLDPHTEKLAPFNEH
WRQVYRLCHPCQIDYDFVGKLETLDEDAAQLLQLLQVDLAAPLP
PELPGTGPPSSWEEDWFAKIPLAWRQQLYKLYEADPVLFGyPKP 
ENLLRD 


5383 


45 



5250 


VERLLGCRNSKRTWRMLISKNMPWRRLQGISFGMYSAEELKKLS

VKSITNPRYLDSLGNPSANGLYDLALGPADSKEVCSTCVQDFSN

CSGHLGHIELPLTVYNPLLFDKLYLLLRGSCLNCHMLTCPRAVI

HLLLCQLRVLEVGALQAVYELERILSRFLEENADPSASEIREEL 

EQYTTEIVQNNLLGSQGAHVKNVCESKSKLIALFWKAHMNAKRC

PHCKTGRSVVRKEHNSKLTITFPAMVHRTAGQKDSEPLGIEEAQ

IGKRGYLTPTSAREHLSALWKNEGFFLNYLFSGMDDDGMESRFN 

P S V F FLD FLWP P S RS R P VSRLGDQM FTNGQT VN LQ AVM KD WL 

IRKLLALMAQEQKLPEEVATPTTDEEKDSLIAIDRSPLSTLPGQ 

SLIDKLYNIWIRLQSHVNIVFDSEMDKLMMDKYPGIRQILEKKE 

GLFRKHMMGKRVDYAARSVICPDMYINTNEIGIPMVFATKLTYP 

QPVTPWNVQELRQAVINGPNVHPGASMVINEDGSRTALSAVDMT 

QREAVAKQLLTPATGAPKPQGTKI VCRHVKNGD I LLLNRQPTLH 

RPSIQAHRARIT..PEEKVLRLHYANCKAYNADFDGDEMNAHFPQS 

ELGRAEAYVLACTDQQYLVPKDGQPLAGHQDHMVSGASMTTRG 

CFFTREHYMELVYRGLTDKVGRVKLLSPSILKPFPLWTGKQWS 

TLLINIIPEDHIPLNLSGKAKITGKAWVKBTPRSVPGFNPDSMC 

ESQVIIREGELLCGVLDKAHYGSSAYGLVHCCYEIYGGETSGKV 

LTCLARLFTAYLQLYRGFTLGVEDILVKPKADVKRQRI I EESTH 

CGPQAVRAALNLPEAASYDEVRGKWQDAHLGKDQRDFNMIDLKF 

KEEVNHYSNEINKACMPFGLHRQFPENTLQLMVQSGAKGSTVNT 

MQISCLLGQIELEGRSTPLMASGKSLPCFEPYEFTPRAGGFVTG 

RFLTGIKPPEFFFHCMAGREGLVDTAVKTSRSGYLQRCIIKHLE 

GLWQYDLTVRDSDGSWQFLYGEDGLDIPKTQFLQPKQFPFLA 

SNYEVIMKSQHLHEVLSRADPKKALHHFRAIKKWQSKHPNTLLR 

RGAFLSYSQKIQEAVKALKLESENRNGR/RPWDS/G/RMLRMWY 

ELDEESRRKYQKKAAACPDPSLSVWRPDIYFASVSETFETKVDD 

YSQEWAAQTEKSYEKSELSLDRLRTLLQL\KWQRSLCEPGEAVG 

LLAAQS I GE P S TQMTLNTFHFAGRGE MNVTLG I PRLRE I LM VAS 

ANIKTPMMSVPVLNTKKALKRVKSLKKQLTRVCLGEVLQKIDVQ 

ESFCMEEKQNKFQVYQLRFQFLPHAYYOQEKCLRPEDILRFMET 

R FFKLLM E S I KKKNNKAS AFRNVNTRRATQRDLDNAG ELGRSRG 

EQEGDEEEEGHIVDAEAEEGDADASDAKRKEKQEEEVDYESEEE 

EEREGEENDDEDMQEERNPHREGARKTQEQDEEVGL/GH*GGPV 

PSRPPDAAPETHPQPGAPGA\EAMERRVQAVREIHPFIDDYQYD 

TEESLWCQVTVKLPLMKINFDMSSLWSLAHGAVIYATKGITRC 

LLNETTNNKNEKELVLNTEGINLPELFKYAEVLDLRRLYSNDIH 

AIANTYGIEAALRVIEKEIKDVFAVYGIAVDPRHLSLVADYMCF 

EGVYKPLNRFGIRSNSSPLQQMTFETSFQFLKQATMLGSHDELR 

S PSACL WGKWRGGTGLFELKQPLR 


5384 


196 


886 


QSCGQRLPTVL*L*GPPGSCPCILSLF\PGRPHALPEIRPYINI 
TILKGDKGDPGPMGLPGYMGREGPQGEPGPQGSKGDKGEMGSPG 
APCQKRFFAFSVGRKTALHSGEDFQTLLFERVFVNLDGCFDMAT 
GQFAAPLRG I YFFS LNVHS WW YKET YVH I MHNQKEAVI LYAQPS 

ERSIMQSQSVMLDLAYGDRVWVRLFKRQRENAIYSNDFDTYITF 
SGHLIKAEDD 


5385 


326 


799 


LMVPRTKKEAPAPPKAEAKAKAL\KAKKAVLKDVHSHKKNKIHM 
SPTFRRPKTL*LRRQPKYPWKSTPRRNKLDHHVIIKFPLTTE*A 
VKK I ENNS LL VFT VDVKANKHQI KQA VKK / L CD I D VA KVNTL I Q 
SDGERKAYVRLAPDYDALWATKIGIT 


5386 


326 


799 


LMVPRTKKEAPAPPKAEAKAKAL\KAKKAVLKDVHSHKKNKIHM 
SPTFRRPKTL*LRRQPKYPWKSTPRRNKLDHHVIIKFPLTTE*A 
VKKI BNNSLLVFTVDVKANKHQ IKQAVKK/ LCDIDVAKVNTL IQ 
S DGERKAY VRLAPD YDAL WATKI GI T 


5387 


2 




2117 


FWAASGGCWFVLGERRAGSLLSASYGTFAMPGMVLFGRRWAIA 
SDDLVFPGFFELWRVLWWIGILTLYLMHRGKLDCAGGALLSSY
LIVLMILLAWICTVSAIMCVSMRGTICNPGPRKSMSKLLYIRL
ALFFPEMVWASLGAAWVADGVQCDRTWNGIIATVWSWIIIAA
TWSIIIVFDPLGGKMAPYSSAGPSHLDSHDSSQLLNGLKTAAT
SVWETRIKLLCCCIGKDDHTRVAFSSTAELFSTYFSDTDLVPSD 
I AAGLALLHQQQDN I RNNQEP AQ WCHAPGS SQEADLDAELKNC 
HHYMQFAAAAYGWPLYIYRNPLTGLCRIGGDCCRSKNPQTMT/M 
VGGDQLQL/CTSAPILHTHRAAVQGLHPRQLPWTRFTELPFLVA 
LDHRKESVWAVRGTMSLQDVLTDLSAESEVLDVECEVQDRLAH 
KGISQAARYVYQRLINDGILSQAFSIAPEYRLVIVGHSLGGGAA 
ALLATMVRAAY PQVRCYAFS PPRGLWS KALQE YSQS F I VS LVLG 
KD V I P RLS VTNLE DLKRR I LR WAHCNKPK YK I L LHGLWYEL FG 
GNPNNLPTELDGGDQEVLTQPLLGEQSLLTRWSPAYSFSSDSPL 
DSSPKYPPLYPPGRIIHLQEEGASGRFGCCSAAHYSAKWSHEAE 
FSKIL I GPKMLTDHM PD I LMRALDS WS DRAACVSCPAQG VS S V 
DVA 



5388 



1569 



753 



TADGGAGGGGRRQAGVRRHYLYPFTGGYRRRRAACQAERPAARS 
KDTDLAAYQKGNLGVQLRNMAQETNHSQVPMLCSTGCGFYGNPR 
TNGMCSVCYKEHLQRQNSSNGRISPPVQCTDGSVPEAQSALDST 
SSSMQPSPVSNQSLLSESVASSQLDSTSVDKAVPETEDVQASVS 
DTAQQPSEEQSKSLE\NRNKKRIAVSCAGRKWDLLGLNAGVEMF 
TWYTVTQMYTIALTITKQMLKNFVFQQEFKSFGSFHQQLLEYK
ILEHLQTKN 



5389 



1569 



753 



TADGGAGGGGRRQAGVRRHYLYPFTGGYRRRRAACQAERPAARS
KDTDLAAYQKGNLGVQLRNMAQETNHSQVPMLCSTGCGFYGNPR 
TNGMCSVCYKEHLQRQNSSNGRISPPVQCTDGSVPEAQSALDST 
SSSMQPSPVSNQSLLSESVASSQLDSTSVDKAVPETEDVQASVS 
DTAQQPSEEQSKSLE\NRNKKRIAVSCAGRKWDLLGLNAGVEMF 
TWYTVTQMYTIALTITKQMLKNFVFQQEFKSFGSFHQQLLEYK
ILEHLQTKN 



5390 



217 



1332 



5391 



EDPRKLMEDKMWSECEGPEMSLVCLTDFQAHAREQLSKSTRDFI 
EGGADDS ITRDDNI AAFKRI RLRPRYLRDVSEVDTRTTIQGEE I 
SAP I C I APTGFHCLVWPDGEMSTARAAQAA\G I CY I TSTFAS CS 
LEDIVIAAPEGLRWFQLYVHPDLQLNKQLIQRVESLGFKALVIT 
LDTP VCGNRRHD I RNQLRRNLTLTDLQS P KKGNAI P YFQMTP I S 
TSLCWNDLSWFQS ITRLPI ILKGILTKEDAELAVKHNVQGI IVS 
NHGGRQLDEVLAS I DALTE WAAVKGKI E VYLDGG VRTGNDVLK 
ALALGAKCIFLGDAILWALASKGEHGVKEVLNILTNEFHTSMA\ 
LTGCRSVAE INRNLVQFSRL 



1292 



5392 



VKKAAGRSRGPPTAGGQRCEEAPGTVMERRLGVRAWVKENRGSF 
QPPVCNKLMHQEQLKVMFVGGPNTRKDYHIEEGEEVFYQLEGDM 
VLRVLEQGKHRDWI RQGE I FLLPARVPH5PQRFANTVGLWER 
RRLETELDGLRYYVGDTMDVLFEKWFYCKDLGTQLAPIIQEFFS 
SEQYRTGKPIPDQLLKEPPFPLSTRSIMEPMSLDAWLDSHHREL 
QAGTPLSLFGDTYETQVIAYGQGSSEGLRQNVDVWLWQLEGSSV 
VTMGGRRLSLGPWMDSLLVLSWGPSY\AW\ERTQGSVALSVT\Q 
DPACKKS PWGEPSCHGLKAATGVPSTLEVPSLPNNSPS PHYLSV 
YCRCVPHRPAHCCHPPSCPSQPRCHAPGRAAAPHLLWQTQPTAL 
PVLPQGLP PAPLLP I PLSLQTQCS TS TPRR PS I KAS 



1623 



I RGSNAQKWGASGSGGAG PQPDPAGPGG VPALAAAVLGACE PR 
CAAPCPLPALSRCRGAGSRGSRGGRGAAGSGDAAAAAEWIRKGS 
FIHK PAHG WLH PDAR VLG PG VS Y WR YMGCI E VLRSMRS LDFNT 
RTQVTREAINRLHEAVPGVRGSWKKKAPNKALASVLGKSNLRFA 
GMSISIHISTDGLSLSVPATRQVIANHHMPSISFASGGDTDMTD 
Y VAYVAKDP INQRACH I LECCEGL\AQS I I STVGQAFELRFKQ Y 
LHSPPKVALPPERLAGPEESAWGDEEDSLEHNYYNSIPGKEPPL 
GGLVDSRLALTQPCALTALDQGPSPSLRDACSLPWDVGSTGTAP 
PGDGYVQADARGPPDHEEHLYVNTQGLDAPEPEDSPKKDLFDMR 
P FEDALKLHECSVAAGVTAAPLPLEDQWPS PPTRRAPVAPTEEQ 
LRQEPWYHGRMSRRAAERMLRADGDFLVRDSVTNPGQYVLTGMH 
AGQPKHLLLVDPEGWRTKDVLFESISHLIDHHLQNGQPIVAAE 
SELHLRGWSREP 



5393 



982 



GGDSAGMTMETQMSQNVCPRNLWLLQPLTVLLLLASADSQAAAP 
P KAVLKLE P PWINVLQ\ EDS VTLTCQGAPQ P / ERSDS IQWFHNG 
\NLIPTHTQPS\YRFKANNN\DSGEYTCQTGQTSL\SDPVHLTV
LSEWLVLQTPHLEFQEGETIMLRCHS\WRDKP\LVKVTFFQNGK 
SQKFSHLDPTFSIPQANHSHSGDYHCTGNIGYTLFSSKPVTITV 
QVPSMGSSSPMGIIVAWIATAVAAIVAAWALIYCRKKRISAN
STDPVKAAQFEPPGRQMIAIRKRQLEETNNDYETADGGYMTLNP
RAPTDDDKNIYLTLPPNDHVNSNN 


5394 


2 


982 


GGDSAGMTMETQMSQNVCPRNLWLLQPLTVLLLLASADSQAAAP 
PKAVLKLEPPWINVLQ\EDSVTLTCQGAPQP/ERSDSIQWFHNG 
\NLIPTHTQPS\YRFKANNN\DSGEYTCQTGQTSL\SDPVHLTV 
LSEWLVLQTPHLEFQEGETIMLRCHS\WRDKP\LVKVTFFQNGK 
SQKFSHLDPTFSIPQANHSHSGDYHCTGNIGYTLFSSKPVTITV 
QVPSMGSSSPMGIIVAWIATAVAAIVAAWALIYCRKKRISAN
STDPVKAAQFEPPGRQMIAIRKRQLEETNNDYETADGGYMTLNP
RAPTDDDKNIYLTLPPNDHVNSNN


5395 


3135 


531 


RASDAKNQEGLLNTRRKSTDSVPISKSTLSRSLSLQASDFDGAS 
SSGNPEAVALAPDAYSTGSSSASSTLKRTKKPRPPSLKKKQTTK 
KPTETPPVKETQQEPDEESLVPSGENLASETKTESAKTEGPSPA 
LLEETPLEPAAGPKAACPLDSESVEGWPPASGGGRVQNSPPVG 
RKTLPLTTAPEAGEVTPSDSGGQEDSPAKGHSVRLEFDYSEDKS 
SWDNQQENPPPTKKIGKKPVAKMPLRRPKMKKTPEKLDNTPASP 
PRSPAEPNDIPIAKGTYTFDIDKWDDPNFNPFSSTSKMQESPKL 
PQQSYNFDPDTCDESVDPFKTSSKTPSSPSKSPASFEIPASAME 
ANGVDGDGLNKPAKKKKTPLKTDTFRVKKSPKRSPLSDPPSQDP 
TPAATPETPPVISAWHATDEEKLAVTNQKWTCMTVDLEADKQD 

YPOP?DT, < 5TFVMFTKT5' <: ?^PTPT7T,nYPMQ\rpTWV'MPirTr'QCT onn 
* tyrjyiw x c v xv Ci i i\c jo tr i. dLLLiiJ x txvio x & x £1 1 I ictlS. i. \jj o Litr^iJ 

DDAPKKQALYLMFDTSQESPVKSSPVRMSESPTPCSGSSFEETE 
ALVNTAAKNQHPVPRGLAPNQESHLQVPEKSSQKELEAMGLGTP 
SEA I E I TAPEGS FASADALLSRLAHP VS LOGALD YLEPDLAEKN 
PPLFAQKLQREAAHPTDVS I SKTALYSRIGTAEVEKPAGLLFQQ 
PDLDS ALQI ARAE I I TKEREVS EWKDKYEESRRE VMSMRK I VAE 
YEKTIAQMIEDEQREKSVS\HQTVQQLVLEKEQA\loADLNSVEK 
\ S LADLFRR YE KMKE VLEG FRKNE EVLKRCAQE YLS RVKKEE QR 
YQALKVHA\EEKLDRANAE\ I AQVRGKAQQEQAAHQASLAERSS 
CRV\DA1>ERTLEQKNKEIEELTKICDELIAKMGKS 


5396 


3135 


531 


RASDAKNQEGLLNTRRKSTDSVPISKSTLSRSLSLQASDFDGAS
SSGNPEAVALAPDAYSTGSSSASSTLKRTKKPRPPSLKKKQTTK
KPTETPPVKETQQEPDEESLVPSGENLASETKTESAKTEGPSPA
LLEETPLEPAAGPKAACPLDSESVEGWPPASGGGRVQNSPPVG
RKTLPLTTAPEAGEVTPSDSGGQEDSPAKGHSVRLEFDYSEDKS
SWDNQQENPPPTKKIGKKPVAKMPLRRPKMKKTPEKLDNTPASP

PRSPAEPNDIPIAKGTYTFDIDKWDDPNFNPFSSTSKMQESPKL 
PQQSYNFDPDTCDESVDPFKTSSKTPSSPSKSPASFEIPASAME
ANGVDGDGLNKPAKKKKTPLKTDTFRVKKSPKRSPLSDPPSQDP 
TPAATPETPPV I S AWHATDEE KLAVTNQKWTCMTVDLEADKQD 

YPnPQDTiQTPVMPTVPQ QPTPPT.nVDMQVR'TirVMPVT/aQaT nnr\ 
* K\£tr&LsJLt& if viXEi i Af oor x CuCiLiU xKlVo lijiD X rlCiivl uDobry U 

DDAPKKQALYLMFDTSQESPVKSSPVRMSESPTPCSGSSFBETE 
ALVNTAAKNQHPVPRGU^NQESHLQVPEKSSQKELEAMGLGTP 
S EAIE I TAPEGS FAS ADALLSRIiAHPVS LCGALD YLEPDLAEKN 
PPLFAQKLQREAAHPTDVS I SKTALYSR I GTAE VEKPAGLLFQQ 
PDLDS ALQ IARAEI ITKEREVSEWKDKYEESRREVMEMRKI VAE 
YEKTIAQMIEDEQREKSVS\HQTVQQLVLEKEQA\LADLNSVEK 
\S LAD LFRRYEKMKE VLEG FRKNEEVLKRCAOE YLS RVKKEEOR 
YQALKVHA\EEKLDRANAE\ I AQVRGKAQQEQAAHQAS LAER S S 
CRV\DALERTLEQKNKEIEELTKICDELIAKMGKS 


5397 


3135 


531 


RASDAKNQEGLLNTRRKSTDSVPISKSTLSRSLSLQASDFDGAS

SSGNPEAVALAPDAYSTGSSSASSTLKRTKKPRPPSLKKKQTTK 

KPTETPPVKETQQEPDEESLVPSGENLASETKTESAKTEGPSPA 

LLEETPLEPAAGPKAACPLDSESVEGWPPASGGGRVQNSPPVG 

RKTLPLTTAPEAGEVTPSDSGGQEDSPAKGHSVRLEFDYSEDKS 

SWDNQQENPPPTKKIGKKPVAKMPLRRPKMKKTPEKLDNTPASP

PRSPAEPNDIPIAKGTYTFDIDKWDDPNFNPFSSTSKMQESPKL
PQQSYNFDPDTCDESVDPFKTSSKTPSSPSKSPASFEIPASAME 
ANGVDGDGLNKPAKKKKTPLKTDTFRVKKSPKRSPLSDPPSQDP 
TPAATPETPPVISAWHATDEEKLAVTNQKWTCMTVDLEADKQD 
YPQPSDLSTFVNETKFSSPTEELDYRNSYEIEYMEKIGSSLPQD
DDAPKKQALYLMFDTSQESPVKSSPVRMSESPTPCSGSSFEETE 
ALVNTAAKNQHPVPRGLAPNQESHLQVPEKSSQKELEAMGLGTP 
SEAIEITAPEGSFASADALLSRLAHPVSLCGALDYLEPDLAEKN
PPLFAQKLQREAAHPTDVSISKTALYSRIGTAEVEKPAGLLFQQ 
PDLDSALQIARAEIITKEREVSEWKDKYEESRREVMEMRKIVAS 
YEKTlAQMIEDEQREKSVS\HQTVQQLVLEKEQA\liADLNSVEK 
\SLADLFRRYEKMKEVLEGFRKNEEVLKRCAQEYLSRVKKEEQR
YQAI)KVHA\ EEKLDRANAE \ IAQVRGKAQQEQAAHQASLAERSS 
CRV\DALERTLEQKNKEIEELTKICDELIAKMGKS 


5398 


56 


5426 


SGEVCRMESNFNQEGVPRPSYVFSADPIARPSEINFDGIKLDLS 

HEFSLVAPNTEANSFESKDYLQVCLRIRPFTQSEKELESEGCVH 

I LDS QTWL KEPQ C I LGRLS E KS S G \ QM \ AQKFS FFPGFLG PAT 

TQKEFFQGCIMHP\VKDLLKGQSRLIFTYGLTNSGKTYTFQGTE 

ENIRILPRTLNVLFDSLQERLYTKMNLKPHRSREYLRLSSEQEK 

EE I AS KSALLRQI KEVTVHNDSDDTLYGSLTNSLNISEFEE5 I X 

DYEQANLNMANSIKFSVWVSFFEIYNEYIYDLFVPVSSKFQKRK 

MIiRLSQDVKGYS F I KDLQW I QVS DS KE AYRLLKLG I KHQSVAFT 

KLNNASSRSHSIFTVKILQIEDSEMSRVIRVSELSLCDLAGSER 

TMKTQNEGERLRETGNINTSLLTLGKCINVLKNSEKSKFQQHVP 

FRESKLTHYF/QSFFNGKGKICMIVNISQCYLAYDETLNVLKFS 

AIAQKVCVPDTLNSSQEKLFGPVKSSQDVSIiDSNSNSKILNVKR 

ATISWEN3LEDLMEDEDLVEELENAEETED/VGETKLLDEDLDK 

TLEENKAFISHEEKRKLLDLIEDLKKKLINEKKEKLTLEFKIRE 

EVTQE FTQ Y WAQREAD FKETLLQERE I LEENAERRiAI FKDLVG 

KCDTREEAAKDI CATKVETEEATACLELKFNQI KAE LAKTKGE L 

I KTKEEL KKRENE S DS L I QELETSNKK 1 1 TQNQR I KEL I N 1 1 DQ 

KEDTINEFQNLKSHMENTFKCNDKADTSSLIIKNKLICNETVEV 

PKDSKSKICSERKRVNENELQQDEPPAKKGSIHVSSAITEDQKK 

S EEVRPN I AE I ED I R VLQENNEGLRAFLLTI ENELKNEKEEKAE 

LNKQIVHFQQELSLSEKKNLTLSKEVQQIQSNYDIAIAELHVQK 

SKNQEQEEKIMKLSNEIETATRSITNNVSQIKLMHTKIDELRTL 

DSVSQISNIDLLNLRDLSNGSEEDNLPWTQLDLLGNDYLVSKQV 

KEYRIQEPNRENSFHSSIEAIWEECKEIVKASSKKSHQIEELEQ 

Q I E KLQAE VKG YKDENNRL KE KEHKNQDDLLKE KE TL I QQLKE E 

LQEKNVTLDVQIQKWEGKRALSELTQGVTCYKAKIKELETILE 

TQKVERSHSAKLEQDILEKESIILKLERNLKEFQEHLQDSVKNT 

KDLNVKELKLKEEITQLTNNLQDMKHLLQLKEEEEETNRQETEK 

LKEELSASSARTQN\LNADLQRKEEDYADLKEKLTDAKKQIKQV 

QKEVSVMRDEDKLLRIKlNEIiEKKKNQCSQELDMKQRXTIQQLK 

EQL INQKVEEAI QQ YERAC KDLNVKE KI I EDMRMTL E EQE QTQ V 

EQDQVL\EAKLEEVERIiATELDRWRVKCNDLETKNNQRSNKEHE 

NNTDVLGKLTNLQDELQES EQKYNADRKKWLEEKMML I TQAKEA 

ENIRNKEMKKYAEDRERFFKQQNEMEILTAQLTEKDSDLQKWRE 

ERDQL VAALE I QLKAL I SSNVQKDNE I EQLKR 1 1 SETS K I ETQ I 

MD I KP KR I S S AD PDKLQTE P l*S TS FE I SRNKI E DGS WLDS CEV 

STENDQSTRFPKPELEIQFTPLQPNKMAVKHPGCTTPVTVKIPK 

ARKRKSNEMEEDLVKCENKKNATPRTNLKFPISDDRNSSVKKEQ 

KVAIRPSSKKTYSLRSQASIIGVNLATKKKEGTLQKFGDFLQHS 

PSILQSKAKKIIETMSSSKLSNVEASKENVSQPKRAKRKLYTSE 

ISSPIDISGQVILMDQKMKESDHQIIKRRLRTKTAK 


5399 


705 


230 


GPRMAKFLSQDQINEYKECFSLYDKQQRGKIKATDLMVAMRCLG 
AS PT P G EVQRHLQTHG I DGNGELD FS T FLT I MHMQ I KQ ED P KKE 
ILLAMLiMVDKEKKGYVMASDLRS KLTSIiGEKLTHKEV\ DDLFRE 
\ADIEPNGKVKYDEFIHKITSYLDGTY 


5400 


931 


246 


S H CS S GME I P PTN YP AS RAALVAQN Y I N YQQG T PHR VFE VQ KVK 
QASMEDIPGRGHKYRLKFAVEEIIQKQVKVNCTAEVLYPSTGQE 
TAPEVNFTFEGETGKNPDEEDNTFYQRLKSMKEPLEAQNI\PDN 
FGNVSPEMTLVLHLAWVACGYIIWQNSTEDTWYKMVKIQTVKQV 
QRNDDFIELDYTILLHNIASQEIIPWQMQVLWHPQYGTKVKHNS
RLPKEVQLE 


5401 


3 


1360 


TGWSYGPTTSLAFLAPRDFPFPPKLLIHPQAWRLSCGAGSMGS 
QAAAEWRNWASWEGSSSLSGCSMGCFKDDRIVFWTWMFSTYFME 
KWAPRQDDMLFYVRRKLAYSGSESGADGRKAAEPEVEVEVYRRD
S KKLPGLGDPDI DWEE S VCLNL I L.QKLDYMVTCAVCTRADGGD I 
HIHKKKSQQVFASPSKHPMDSKGEESKISYPNIFFMIDSF\EE\ 
VFSDMTVGKGEMVCVELVASDKTNTFQGVIFQGSIRYEALKKVY 
DNRVSVAARMAQK\MSFGFSKYSNMEF\VR\MKGPQGKGHAEMA
VSRVSTGDTSPCGTEEDSSPASPMHERVTSFSTPPTPERNNRPA 
FFSPSLKRKVPRNRIAEMKKSHSANDSEEFFREDDGGADLHNAT 
NLRSRSLSGTGRSLVGSWLKLNRADGNFLLYAHLTYVTLPLHRI 
LTDILEVRQKPILMT 


5402 


3445 


1563 


GECFIMAAWQQNDLVFEFASNVMEDERQLGDPAIFPAVIVEHV 
PGADILNSYAGLACVEEPNDMITESSLDVAEEEIIDDDDDDITL 
TVEASCKDGDETIETIEAAEALLNMDSPGPMLDEKRINNNIFSS 
PEDDMWAPVTHVSVTLDGIPEVMETQQVQEKYADSPGASSPEQ 
PKRKKGRKTKPPRPDSPATTPNISVKKKNKDGKGNTIYLWEFLL 
ALLQDKATCPKYIKWTQREKGIFKLVDSKPVSRLWRKHKNKP\D 
MNYEPMGRALRYYYQRGIIaAKVEGQRLVYQFKEMPKDLlYINDE 
DPSSSIESSDPSLSSSATSNRNQTSRSRVSSSPGVKGGATTVLK 
PGNSKAAKPKDPVEVAQPSSVLRTVQPTQSPYPTQLFRTVHWQ 
PVQAVPEGEAARTSTMQDETLNSSVQSIR\TIQAPTQVPVWSP 
RNQQ\LHTVTLQTVPLTTVIASTDPSAGTGSQKFILQAIPSSQP
MTVLKENVMLQSQKAGSPPSIVLGPARV\QQVLTSNVQTICNGT
VSV\ASSPSFS\ATAPWTLFLLGSSQLVAHPPGTVITSVIKTQ 
ETKTLTQEVEKKESEDHLKENTEKTEQQPQPYVMWSSSNGFTS 
QVAMKQNELLEPNSF 


5403 


3445 


1563 


GECFIMAAWQQNDLVFEFASNVMEDERQLGDPAIFPAVIVEHV 
PGADILNSYAGLACVEEPNDMITESSLDVAEEE I IDDDDDDITL 
TVEASCHDGDETIETIEAAEALLNMDSPGPMLDEKRINNNIFSS 
PEDDMWAPVTHVSVTLDGIPEVMETQQVQEKYADSPGASSPEQ
PKRKKGRKTKPPRPDSPATTPNISVKKKNKDGKGNTIYLWEFLL 
ALLQDKATCPKYIKWTQREKGIFKLVDSKPVSRLWRKHKNKP\D 
MNYEPMGRALRYYYQRGILAKVEGQRLVYQFKEMPKDLIYINDE 
DPSSSIESSDPSLSSSATSNRNQTSRSRVSSSPGVKGGATTVLK 
PGNSKAAKPKDPVEVAQPSEVLRTVQPTQSPYPTQLFRTVHWQ 
PVQAVPEGEAARTSTMQDETLNSSVQSIR\TIQAPTQVPVWSP 
RNQQ\LHTVTLQTVPLTTVIASTDPSAGTGSQKFILQAIPSSQP 
MTVLKENVMLQSQKAGSPPSIVljGPARV\QQVLTSNVQTICNGT 
VSV\ASSPSFS\ATAPWTLFLLGSSQLVAHPPGTVITSVIKTQ 
ETKTLTQEVEKKESEDHLKENTEKTEQQPQPYVMWSSSNGFTS
QVAMKQNELLEPNSF 


5404 


187 


1111 


LPVTLIFAKMKTLQSTLLLLLLVPLIKPAPPTQQDSRIIYDYGT 
DNFEESIFSQDYEDKYLDGKNIKEKETVIIPNEKSLQLQKDEAI
TPLPPKKENDEMPTCLLCVCLSGSVYCEEVDIDAVPPLPKESAY 
LYARFNKIKKLT\AKDFADIPNLRRLDFTGNLIEDIEDGTFSKL 
SLVEELSLAENQLLKLPVLPPKLTLFNAKYNKIKSRGIKANAFK 
KLNNLTFLYLDHNALESVPLNLPESLRVIHLQFNNIASITDDTF 
CKANDTSYIRDRIEEIRLEGNPIVLGKHPNSFICLKRLPIGSYF 


5406 


2199 
279 


1220 
2732 


QNSRSLHMDPQNQHGSGSSLWIQQPSLDSRPRLDYEREIQPTA 
I LSLDQ I KAI RGSNE YTEGPS WKRPAPRTAPRQE KHERTHE 1 1 
P INVNNNYEHRHTSHLGHAVLPSNARGPILSRS TS TGS AASSGS 
NS S AS S EQG LLGRS P PTRP VPGHRS ERAI RTQ PKQL I VDDLKGS 
LKEDLTQHKFICEQCGKCKCGECTAPRTLPSCLACNRQCLCSAE 
SMVEYGTCMCL\VKGIFYHCSNDDEGDSYSDNPCSCSQSHCCSR 
YLCMGAMSLFLPCLLCYPPAKGCLKLCRRCYDWIHRPGCRCKNS 
NTVYCKLESCPSRGQGKPS 

RWRTYNVEQPLTFMDVAIEFCLEEWQCLDTAQQNLYRNVMLENY 
RNLVFLG/ I IAVSKPDLITCLEQEKEPWEPMRRHEMVAKPPVMC 
SHFTQDFWPEQHIKDPFQKATLRRYKNCEHKNVHLKKDHKSVDB 
CK VHRGG YNG FNQCLP ATQS K I FL FDKC VKAFH KFSNSNRH K I S 
HTEKKLFKCKECGKSFCMLSHLAQHKIIHTRVNFCKCEKCGKAF 
NCPSIITKHKRINTGEKPYTCEECGKVFNWSSRLTTHKKNYTRY 
KLYKCEECGKAFNKSS ILTTHKI I RTGEKFYKCKECAKAFNQSS 
NIiTEHKKIHPGEKPYKCEECGKAFNWPSTLTKHKRIHTGEKPYT 
CEECGKAFNQFSNLTTHKRIHTA\EKFYKCTECGEAFSRS\SNL 
TKHKEIHTEKKPYKCEECGKAFKWSSKLTEHKLTHTOPTfPYTfr'P 
KCGKAFNCPS I ITKHNRINTGEKPYTCEECGKVFNWSSRLTTHK 
KNYTRYKLYKCEECX3KAFNKSSILTTHKKIHIEKKFYKCEECGK 
AFKWSSKLTEHKITHTGEKPYKCEECGKAFNHFSILTKHKRIHT 
GEKPYKCEECGKAFTQSSNIiTTHKKIHTGEKFYKCEECGKAFTQ 
SSNLTTHKKIHTGGKPYKCEECGKAFNQFSTLTKHKIIHTEEKP 
YKCEECGKAFKWSSTLTKHKIIHTGEKPYKCEECG\KAFKI>SST 
LSTHKIIHTGEKPYKCEKCGKAFNRPSNLIEHKKIHTGEQPYKC 
EECGKAFNYSSHLNTHKRIHTKEQPYKCKECGKAFNQYSNLTTH 
NKIHTGEKLYKPEDVTVILTTPQTFSNIK 


5407 


3 


659 


RPRRRQSSCCTGWliAGWLLRAAPRFCRRTETDMEQGKGLAVLIL 
AIILLQGTLAQSIKGNHIiVKVYDYQEDGSVLLTCDAEAKNITWF 
KDGKMIGFLTEDKKKWNLGSNAKDPRGMYQCKGSQNKSKPLQVY 
YRMCQNCIELNAATISGFLFAEIVS I FDLAVGVYFIAGTGMEFR 
QS\RASDKQTLLP\NDPAPTQPLKDPRKMTQYSHLQGN\QLRRN 


5408


2745 


6128 


QGSKGTCHPQAQQPWDEGVWQEAPSQSEPWGQSQEPPTMPQRLP 
HARQHTPLPLGSADYRRWSVRPQGPHRDPKDSRDAAKREQGSL 
APRPVPASRGGKTLCKGYRQAPPGPPAQFQRPICSASPPWASRF
STPCPGGAVREDTYPVGTQGVPSLALAQGGPQGSWRFLEWKSMP 
RLPTDLDIGGPWFPHYDFERSCWVRAISQEDQLATCWQAEHCGE 
VRNKDMSWPEEMSFIANSSKIDRHKVPTEKGATGLSNLGNTCFM
NSSIQCVSNTQPLTQYFISGRHLYELNRTNPIGMKGHMAKCYGD 
LVQELWSGTQKNVAPLKLRWTIAKYAPRFNGFQQQDSQELLAFL 
LDGLHEDLNRVHEKPYVELKDSDGRPDWEVAAEAWDNHLRRNRS
IWDLFHGQLRSQVKCKTCGHISVRFDPFNFLSLPLPMDSYMHL 
EITVIKLDGTTPVRYGLRLNMDEKYTGLKKQLSDLCGLNSEQIL 
LAEVHGSNIKNFPQDNQKVRLSVSGFLCAFEIPVPVSPISASSP 
TQTDFSSSPSTNEMFTLTTNGDLPRPIFIPNGMPNTWPCGTEK
NFTNGMVNGHMPSLPDSPFTGYIIAVHRKMMRTELYFLSSQKNR
PSLFGMPLIVPCTVHTRKKDLYDAVWIQVSRLASPLPPQEASNH 
AQDCDDSMGYQYPFTLRWQKDGNSCAWCPWYRFCRGCKIDCGE 
DRAFIGNAYIAVDWHPTALHLRYQTSQERVVDEHESVEQSRRAQ
VEPINLDSCLRAFTSEEELGENEMYYCSKCKTHCLATKKLDLWR 
LPPILIIHLKRFQFVNGRWIKSQKIVKFPRSSFDPSAFLVPRDP 
ALCQHKPLTPQGDELSEPRILAREVKKVDAQSSAGEEDVLLSKS 
PSSLSANIISSPKGSPSSSRKSGTSCPSSKNSSPNSSPRTLGRS 
KGRLRLPQIGSKNKLSSSKENLDASKENGAGQICELADALSRGH
VLGGSQPELVTPQDHEVALANGFLYEHEACGNGCGNGYSNGQLG
NHSEEDSTDDQREDTRIKPIYNLYAISCHSGILGGGHYVTYAKN
PNCKWYCYNDSSCKELHPDEIDTDSAYILFYEQQGIDYAQFLPK 
TDGKKMADTSSMDEDFESDY\EKYCVLQ


5409 


2745 


6128 


QGSKGTCHPQAQQPWDEGVWQEAPSQSEPWGQSQEPPTMPQRLP
HARQHTPLPLGSADYRRWSVRPQGPHRDPKDSRDAAKREQGSL
APRPVPASRGGKTLCKGYRQAPPGPPAQFQRPICSASPPWASRF
STPCPGGAVREDTYPVGTQGVPSLALAQGGPQGSWRFLEWKSMP
RLPTDLDIGGPWFPHYDFERSCWVRAISQEDQLATCWQAEHCGE
VRNKDMSWPEEMSFIANSSKIDRHKVPTEKGATGLSNLGNTCFM
NSSIQCVSNTQPLTQYFISGRHLYELNRTNPIGMKGHMAKCYGD
LVQELWSGTQKNVAPLKLRWTIAKYAPRFNGFQQQDSQELLAFL
LDGLHEDLNRVHEKPYVELKDSDGRPDWEVAAEAWDNHLRRNRS
IWDLFKGQLRSQVKCKTCGHISVRFDPFNFLSLPLPMDSYMHL
EITVIKLDGTTPVRYGLRLNMDEKYTGLKKQLSDLCGLNSEQIL
LAEVHGSNIKNFPQDNQKVRLSVSGFLCAFEIPVPVSPISASSP






WO 01/53312 



PCT/US00/34263 



TQTDFSSSPSTNEMFTLTTNGDLPRPIFIPNGMPNTWPCGTEK
NFTNGMVNGHMPSLPDSPFTGYIIAVHRRMMRTELYFLSSQKNR
PSLFGMPLIVPCTVHTRKKDLYDAVWIQVSRLASPLPPQEASNH
AQDCDDSMGYQYPFTLRWQKDGNSCAWCPWYRFCRGCKIDCGE
DRAFIGNAYIAVDWHPTALHLRYQTSQERVVDEHESVEQSRRAQ 
VEPINLDSCLRAFTSEEELGENEMYYCSKCKTHCLATKKLDLWR 
LPPILIIHLKRFQFVNGRWIKSQKIVKFPRESFDPSAFLVPRDP 
ALCQHKPLTPQGDELSEPRILAREVKKVDAQSSAGEEDVLLSKS 
PSSLSANIISSPKGSPSSSRKSGTSCPSSKNSSPNSSPRTLGRS 
KGRLRLPQIGSKNKLSSSKENLDASKENGAGQICELADALSRGH
VLGGSQPELVTPQDHEVALANGFLYEHEACGNGCGNGYSNGQLG
NHSEEDSTDDQREDTRIKPIYNLYAISCHSGILGGGHYVTYAKN
PNCKWYCYNDSSCKELHPDEIDTDSAYILFYEQQGIDYAQFLPK
TDGKKMADTSSMDEDFESDY\EKYCVLQ 


5410 


2 


710 


LRFPGQARHVWLAARMQAPHKEHLYKLLVIGDLGVGKTS I IKRY 
VHQNFSSHYRATIGVDFALKVLHWDPETWRLQLWDIAGQERFG 
NMTRVYYREAMGAFIVFDVTRPATFEAVAKWKNDLDSKLSLPNG 
KP VS WLLANKCDQGKDVLMNNGLKMDQFCKEHG FVGW FETSAK 
EN IN I D EASR CL VKHI LAN E CDLME S I E P D WKPHLTS TKVAS C 
SG\CAKI LVGTFAGVW 


5411 


1302 


289 


TGPAAAGRRKALGSFGKPSPVTGLRAARRRRTRPSAPAAPSVGC 
G KRRESDAGAGGERAS VRTG SGRRGGRTMAGDS EQTLQNHQQ PN 
GGE P FL I G VSGGTAS G KS S VCAK I VQ LLG QN E VD YRQKQ W I LS 
QDSFYRVLTSEQKAKALKGQFNFDHPDAFDNELILKTLKEITEG 
KTVQIPVYDFVSHSRKEETVTVYPADWLFEGILAFYSQBR/IR 
DLFQMKLFVDTDADTRLSRRVLKDISERGRDLEQILSSSTLRFV 
KPA\FEEFCLPPK\KYADVIIPR\GADN\RVPINLIVQHIQ\DI 
LNGG P S \ NRQTNG CLNG YTPS R KRQAS E S S S R PH 


5412 


3180 


313 


QGISNFFHKEANFWFEVSGYLISPLRSPFVDPALEWSLMASPWN
KMEGE S S R FE IHTP VSDKKKKK CS I H KE R PQKHSHE I FRDS S LV 
NEQSQITRRKKRKKDFQHLISSPLKKSRICDETANATSTLKKRK 
KRRYSALEVDEEAGVTVVLVDKENINNTPKHFRKDVDVVCVDMS 
I EQKL PR K \ PKTDKFQ VLAKS H \ AHKS EALHS KVRE KKNKKHQ R 
KAAS WES QRA\RDTLPQSE FPTQEES WLS VGPGGE I TELP \ ASA 
HKNKSKKKKKKSSNREYET\LAMPEGSQAGREAGTDMQESQPTV 
GLDDETPQLLGPTHKKKSKKKKKKKSNHQEFESLAMPEGSQVGS 
E VGADMQES \RPAVGLHGETAG I PAPAYKNKSKKKKKKSNHQE F 
EAVAMPESLESAYP EGSQVGSE VGTVEGSTALKGFKESNSTKKK 
SKKRKLTSVKRARVSGDDFSVPSKNSESTLFDSVEGDGAMMEEG 
VKSRPRQKKTQACLASKHVQEAPRLEPANEEHNVETAEDSEIRY 
LSADSGDADDSDADLGSAVKOLOEFI PN T KDHATlTTifPMVPnn 

LERFKEFKAQGVAIKFGKFSVKENKQLEKHVEDFiiALTGIESAD 
KLLYTDRYPEEKSVITNLKRRYSFRLHIG\RNIARPWKLIYYRA 
KKMFDVNNYKGRYSEGDTEKLKMYHSLLGNDWKTIGEMVARRSL 
SVALKFSQISSQRNRGAWSKSETRKLIKAVEEVILKKMSPQELK 
EVDSKLQENPESCLSIVREKLYKGISWVEVEAKVQTRNWMQCKS 
KWTEILTKRMTNGRRIYYGMNALRAKVSLIERLYEINVEDTNEI 
DWEDLASAIGDVPPSYVQTKFSRLKAVYVPFWQKKTFPEIIDYL 
YETTLPLLKE KLE KMM E KKGTK I QTP AAP KQVF PFRD I F YYE DD 
S EGGGHRKRKRRPRRHAWFTPVI PVLWEAKAGWI I 


5413 


3753


1304 


RFPAGVAPRRAMANVSKKVSWSGRDRDDEEAAPLLRRTARPGGG 
TPLLNGAGPGAARQS PRSALFRVGHMSS VKLDDELLE P \DMDP P 
HPFPKEIPHNEKLLSLKYESLDYDNSENQLFLEEERRINHTAFR 
TVEIKRWVICALIGILTGLVACFIDIWENLAGbKYRVIKGNID 
KFTEKGGLSFSLLLWATLNAAFVLVGSVIVAFIEPVAAGSGIPQ 
IKCFLNGVKI PHWRLKTLVI KVSGVILSWGGLAVGKEGPMIH 
SGSVIAAGISQGRSTSLKRDFKI FEYLRRDTEKRDFVSAGAAAG 
VS AAFGAP VGGVLFSLE EGAS FWNQFLTWR I FFASM I STFTLNF 
VLS I YHGNMWDLSS PGL INFGRFDS EKMAYTI HE I P VF I AMG W 
GG VLGAVFNALN YWLTMFRIR Y IHRPCLQVI EAVLVAAVTATVA 
FVLIYSSRDCQPLQGGSMSYPLQLFCADGEYNSMAAAFFNTPEK
SWSLPHDPPGSYNPLTLGLFTLVYFFLACWTYGLTVSAGVFIP 
SLLIGAAWGRLFGISLSYLTGAAIWADPGKYALMGAAAQLGQIV 
RMTLSLTVIMMEATSNVTYGFPIMLVLMTAKIVGDVFIEGLYDM 
HIQLQSVPFLHWEAPVTSHSLTAREVMSTPVTCLRRREKVGVIV 
D VL S DTASNHNG FP WEHADDTQ P ARLQGL I LRS QL I VLL KHKV 
FVERSNLGLVQRRLRLKDFRDAYPRFPP IQS IHVSQDERECTMD 
LS E FMN PS P YT VP QE AS L PRVF KLFRALGLRHL WVDNRNQ WG 
LVTRKDLARYRLGKRGLEELSLAQT 


5414 


2130 


390 


GVASAWDRALFS PLLS PTSRVFRTS PPRCVSTETGRRDRARVPS 
QWCSVLQGKLPVSGRTSLACVRSILLSPASSPRKVGIVGGTGAR 
AGAAP RDHG R VRH RR P S S ARRM TR TTGQCLAPRG CQG PRG TR S P 
RSPRSRTRRGCSASPACLP/CRSALIVAVLCYINLLNYMDRFTV 
AGVLPDIEOFFNIGDSSSGL.TOTVFTc;cvMV7 &D\/yrVT nnDwr 
RKY LMCGG I AFW S L VTLG S S FI PG EH F WLLLLTRGLVGVG E AS Y 
STIAPTLIADLFVADQRSRMLSIFYFAIPVGSGLGYIAGSKVKD 
MAGDWHWALRVTPGLGWAVLLLFLWREPPRGAVERHSDLPPL 
NPTS W W ADLRALARN P S FVL S SLG FTAVAF VTGS LALWAPAFLL 
RSRVVLGETPPCLPGDSCSSSDSLIFGLITCLTGVLGVGLGVEI 
SRRLRHSNPRADPLVCATGLLGSAPFLFLSLACARGSIVATYIF 
IFIGETLLSMNWAIVADILLYVVIPTRRSTAEAFQIVLSHLXjGD 
AGS P YLIGLI SDRLRRNWPPSFLSEFRALQFSLMLCT^FVGALGG 
AAFLGTAHLH 


5415 


693 


2986 


I P PKTKLELQKH \LTTLT \NQEQAT I FEEVQKLRPRNEQRENEL 
IISFLRCLFEEKQKEHIHIGEMKQTSQMAAENIGSELPPSATRF 
RLDMLKNKAKR S LTES LE S I LSRGNKARGLQEHS I S VD LDSShS 
STLSNTSKEPSVCEKEALPISESSFKLLGSSEDLSSDSESHLPE 
EPAPLS PQQAFRRRANTLSHFPIECQEPPQPARGSPGVSQRKLM 
RYHSVSTETPHERKDFESKANHLGDSGGTPVKTRRHSWRQQIFL 
RVATPQKACDSSSRYEDYSELGELPPRSPLEPVCEDGPFGPPPE 
EKKRTSRELRELWQKAILQQILLLRMEKENQKLQASENDLLNKR 
LKLDYEE ITPCLKEVTTVWFK'Mr.ciTPfil? Q KT VT7nMt7VMUc n\mr\ 

GVP\RHHRGEIWKFLAEQFHLKHQFPSKQQPKDVPYKELLKQLT 
SQQHAILIDLGRTFPTHPYFSAQLGAGQLSLYNILKAYSLLDQE 
VGYCQGLSFVAGILLLHMSEEEAFKMLKFLMFDMGLRKQYRPDM 
IILQIQMYQLSRLLHDYHRDLYNHLEEHEIGPSLYAAPWFLTMF 
ASQFPLGFVARVFDM I FLQGTE VI FKVALSLLGSHKPLILQHEN 
LETIVDFIKSTLPNLGLVQMEKTINQVFEMDIAKQLQAYEVEYH 
VLQEELIDSSPLSDNQRMDKLEKTNSSLRKQNLDLLEQLQVANG 
RIQSLEATIEKLLSSESKLKQAMLTLELERSALLQTVEELRRRS 
AKPSDREPECTQPEPTGD 


5416 


27 


4074 


KS QLFCF WGGKAGDI LSGDQDKEQKDP YFVETP YG YQLDLDFL K 
YVDDIQKGNTIKRLNIQKRRKPSVPCPEPRTTSGQQGIWTSTES 
LSSSNSDDNKQCPNFLIARSQVTSTPISKPPPPLETSLPFLTIP 
ENRQLP P P S PQL P KHNLHVTKTLMETRRRLEQERATMQMTPG E F 
RRPRLAS FGGMGTTSSLPS FVGSGNHNPAKHQLQNG YQGNGD YG 
SYAPAAPTTSSMGSSIRHSPLSSGISTPVTNVSPMHLQHIREQM 
AIALKRLKELEEQVRTIPVLQVKISVLQEEKRQLVSQLKNQRAA 
SQINVCGVRKRSYSAGNASQLEQLSRARRSGGELYIDYEEEEME 
T VE Q S TQRI KE FRQL \ TADMQAL EQK I QDS S CE AS S ELRENGE C 
RS VAVG AE ENMND I VV YHRGS RS CKDAAVGTLVEMRNCGVS VTE 
AMLGVMTEADKEIELQQQTIESLKEKIYRLEVQLRETTHDREMT
KL KQELQAAG S R KKVDKATMAQ PLVFS KWEA WQTRDQM VG S H 
MDLVDTCVGTSVETNSVGISCQPECKNKWGPELPMNWWIVKER 
VEMHDRCAGRSVEMCDKSVSVEVSVCETGSNTEESVNDLTLLKT 
NLNLKEVRS IGCGDCSVDVTVCS P KE CAS RG VNTE AVS QVEAAV 
MAVPRTADQDTSTDLEQVHQFTNTETATLIESCTNTCLSTLDKQ 
TSTQTVETRTVAVGEGRVKDINSSTKTRSIGVGTLLSGHSGFDR 
PSAVKTKESGVGQININDNYLVGLKMRTIACGPPQLTVGLTASR 
RSVGVGDDPVGESLENPQPQAPLGMMTGLDHYIERIQKLLAEQQ 
TLLAENYSELAEAFGEPHSQMGSLNSQLISTLSSINSVMKSAST
EELRNPDFQKTSLGKITGSYLGYTCKCGGLQSGSPLSSQTSQPE
QE VGTS EG KP I SS LDA FPTQEGTLS P VNLTDDQIAAGL YACTNN 
ESTLKSIMKKKDGNKDSNGAKKHLQFVGINGGYETTSSDDSSSD 
ESSSSESDDECDVIEYPLEEEEEEEDEDTRGMAEGHHAVNIEGL
KSARVEDEMQVQECEPEKVEIRERYELSEKMLSACNLLKNTIND 
PKALTSKDMRFCLNTLQHEWFRVSSQKSAIPAMVGDYIAAFEAI 
S PDVLRYVINLADGNGNTALHYS VSHSNFE I VKLLLDADVCNVD 
HQNKAGYTPIMLAALAAVEAEKDMRIVEELFGCGDVNAKASQAG
QTALMLAVSHGRIDMVKGLLACGADVNIQDDEGSTALMCASEHG 
HVEIVKLLLAQPGCNGHLEDNDGSTALSIALEAGHKDIAVLLYA 
HVNFAKAQSPGTPRLGRKTSPGPTHRGSFD 


5417 


27 


4074 


KS QLFCFWGGKAGD I LSGDQDKEQKDPYFVE TP YGYQLDLDFLK 
YVDDIQKGNTIKRLNIQKRRKPSVPCPEPRTTSGQQGIWTSTES 
LSSSNSDDNKQCPNFLIARSQVTSTPISKPPPPLETSLPFLTIP 
ENRQLPPPSPQLPKHNLHVTKTLMETRRRLEQERATMQMTPGEF 
RRPRLASFGGMGTTSSLPSFVGSGNHNPAKHQLQNGYQGNGDYG 
SYAPAAPTTSSMGSSIRHSPLSSGISTPVTNVSPMHLQHIREQM 
AIALKRLKELEEQVRTIPVLQVKISVLQEEKRQLVSQLKNQRAA
SQINVCGVRKRSYSAGNASQLEQLSRARRSGGELYIDYEEEEME 
TVEQSTQRIKEFRQL\TADMQALEQKIQDSSCEASSELRENGEC 
RSVAVGAEENMNDIVVYHRGSRSCKDAAVGTLVEMRNCGVSVTE 
AM LG VMTE ADKE I E LQQQT I E S LKE KI Y RLE VQLRE TTHDREMT 
KLKQE LQAAGS RKKVD KATMAQ P L VFS KWEA WQTRD QMVGSH 
MDLVDTCVGTSVETNSVGISCQPECKNKWGPELPMNWWIVKER
VEMHDRCAGRSVEMCDKSVSVEVSVCETGSNTEESVNDLTLLKT 
NLNLKE VRS IG CGDCS VDVTVCS P KE CAS RGVNTEAVS QVEAAV 
MAVPRTADQDTSTDLEQVHQFTNTETATLIESCTNTCLSTLDKQ 
TSTQTVETRTVAVGEGRVKDINSSTKTRSIGVGTLLSGHSGFDR 
P SAVKTKESGVGQIN INDN YLVGLKMRT I ACGPPQLT VGLTASR 
RSVGVGDDPVGESLENPQPQAPLGMMTGLDHYIERIQKLLAEQQ 
TLLAENYSELAEAFGEPHSQMGSLNSQLISTLSSINSVMKSAST 
EELRNPDFQKTSLGKITGSYLGYTCKCGGLQSGSPLSSQTSQPE 
QEVGTSEGKPISSLDAFPTQEGTLSPVNLTDDQIAAGLYACTNN
ESTLKSIMKKKDGNKDSNGAKKNLQFVGINGGYETTSSDDSSSD 
ESSSSESDDECDVIEYPLEEEEEEEDEDTRGMAEGHHAVNIEGL 
KSARVEDEMQVQECEPEKVEIRERYELSEKMLSACNLLKNTIND
PKALTSKDMRFCLNTLQHEWFRVSSQKSAIPAMVGDYIAAFEAI 
S PDVLRYV I NLADGNGNTALHYS VSHSNFE I VKLLLDADVCNVD 
HQNKAG YTPI MLAALAAVEAEKDMR I VEELFGCGDVNAKASQAG 
QTALM LAVS HGR I DMVKG LLACG ADVN I QDDEG S TALM CAS EHG 
HVEIVKLLLAQPGCNGHLEDNDGSTALSIALEAGHKDIAVLLYA 
HVNFAKAQSPGTPRLGRKTSPGPTHRGSFD 


5418 


24 


1133 


SVPRAGGDME TGAAEL YDQALLG I LQHVGNVQD FLRVLFG FLYR 
KTDFYRLLRHPSDRMGFPPGAAQALVLQVFKTFDHMARQDDEKR 
RQELEEKIRRKEEEEAKTVSAAAAEKEPVPVPVQEIEIDSTTEL 
DGHQEVEKVQPPGPVKEMAHGSQEAEAPGAVAGAAEVPR\EPPI 
LPRIQEQFQKNPDSYNGAVRENYTWSQDYTDLEVRVPVPKHWK 
GKQ VS VALS SS S I RVAMLEENGER VLMEGKLTHKINTES SLWS L 
EPGKCVLVNLSKVGEYWWNAILEGEEPIDIDKINKERSMATVDE 
EEQAVLDRLTFDYHQKLQGKPQSHELKVHEMLKKGWDAEGSPFR 
GQR FDPAMFN I S PGAVQ F 


5419 


1395 


259 


GTHPLDPDLVSRTSVQGPLMTMACPGMSDTEESPFLGPRAAEEG 
SESEACEAFGRRKSEEEGRRSDTSGFGRSRKHKVNWKHPERADA 
KDPASLPQC/LGP/DCVRPAQPSSKYCSDDCGMKLAANRIYEIL 
PQRIQQWQQSPCIAEEHGKKLLERIRREQQSARTRLQEMERRFH 
ELEAIILRAKQQAVREDEESNEGDSDDTDLQIFCVSCGHPINPR 
VALRHMERCYAKYES QTS FGS M YPTR IEGATRLFCDVYNPQS KT 
YCKRLQVLCPEHSRDPKVPADEVCGCPLVRDVFELTGDFCRLPK 
RQCNRHYCWEKLRRAEVDLERVRVWYKLDELFEQERNVRTAMTN 
RAGLLALMLHQTI QHDPLTTDLRSSADR 


5420 


117 


1733 


NEAGGACPFKGGASGRLYLSPRLPRVSVAGCEERPLGWVWVLGG 
GGFLPARPPRAQRHLGFSHAEQSMEAPDYEVLSVREQLFHERIR
ECIISTLLFATLYILCHIFLTRFKKPAEFTT\GMMKMPPSTRL/
LLELCTFTLAIALGAVLLLPFSIISNEVLLSLPRNYYIQWLNGS
L IHGLWNLVFLFSNLSLI FLMPFAYFFTESEGFAGSRKG VLGRV 
YETVVMLMLLTLLVLGMVWVASAIVDKNKANRESLYDFWEYYLP 
YL YS CIS FLG VLLLLVCT PLGLARMFS VTGKLLVKPRLLEDLEE 
QLYCSAFEEAALTRRICNPTSCWLPLDMELLHRQVLALQTQRVL
LEKRRKASAWQRNLGYPIAMLCLLVLTGLSVLIVAIHILELLID 
EAAM PRGM QGTS LGQVS FS KLGS FGAVI Q WL I F YLM VS S WGF 
YSSPLFRSLRPRWHDTAMTQIIGNCVCLLVLSSALPVFSRTLGL 
TRFDLLGDFGRFNWLGNFYIVFLYNAAFAGLTTLCLVKTFTAAV 
RAELIRAFGERE 


5421 


117 


1733 


NEAGGACP FKGGAS GRL YLS PRLPRVS VAGCEERP LGWVW VbGG 
GGFLPARPPRAQRHLGFSHAEQSMEAPDYEVLSVREQLFHERIR 
EC I I STLLFATL Y I LCH I FLTRFKKPAEFTT\GMMKMPPSTRL/ 
LLELCTFTLAI ALGAVLLLPFS I ISNEVLLSLPRNYYIQWLNGS 
L I HGL WNLVFL FS NLS L I FLM P FA YF FTES EG FAG S RKG VLGRV 
YETVVMLMLLTLLVLGMVWVASAIVDKNKANRESLYDFWEYYLP 
YLYSCISFLGVLLLLVCTPLGLARMFSVTGKLLVKPRLLEDLEE 
QL YCS AFEEAALTR RICNPTSCWLP L DMELLHRQ VLALQTQ R VL 
LEKRRKASAWQRNLGYPLAMLCLLVLTGLSVLIVAIHILELLID 
EAAM P RGMQGTS LGQVS FS KLGS FGAVIQWL I FYLMVS S WGF 
YSSPLFRSLR PRWHDTAMTQI I GNC V CLL VLS S AL P VF S RTLG L 
TRFDLLGDFGRFNWLGNFYIVFLYNAAFAGLTTLCLVKTFTAAV 
RAELIRAFGERE 


5422 


3 


1263


SCGESLPTWLAGASRPGIGRKGGAWGGRGGSSPAQVLLSPGPVF ' 

KAGCNWWHLSRDQAGVQRCDLGSSQPPPLGFKRFSCLSLPSSWD 

YRSTVLCVSKMEADLSGFNIDAPRWDQRTFLGRVKHFLNITDPR 

TVFVSERELDWAKVMVEKSRMGWPPGTQVEQLLYAKKLYDSAF 

HPDTGEKMNVIGRMSFQLPGGMIITGFMLQFYRTMPAVIFWQWV 

NQS FNALVNYTNRNAAS PTS VRQMALS Y FTATTTAVATAVGMNM 

LTKKAPPLVGRWVPFAAVAAANCVNI PMMRQQELI KGICVKDRN 

ENEIGHSRRAAAIGITQWISRITMSAPGMILLPVIMERLEKLH 

FMQKVKVL/SAPLQVMLSGCFLIFMVPVACGLFPQKCBLPVSYL 

EPKLQDTI KAKYGELEPYVYFNKGL 


5423 


3186 


905 


G VS MALGE EKAE AE AS EDTKAQ S YGRGS CRER ELD IPGPMSGEQ ' 
P PRLEAEGGLI S P VWGAEG I PAPTCW IGTDPGG PSRAHQPQASD 
ANRE P VAERS E P ALS GLP PATMGSGDLLLS GES QVEKTKLS S S E 
EFPQTLSLPRTTICSGHDADTEDDPSLADLPQALDLSQQPHSSG 
LSCLSQWKSVLSPGSAAQPSSCSISASSTGSSLQGHQERAEPRG 
GSLAKVSSSLEPWPQEPSSWGLGPRPQWSPQPVFSGGDASGL 
GRRRLS FQAE YWACVLPDSLPPS PDRHS PLWNPNKE YEDLLDYT 
YPLRPGPQLPKHLDSRVPADPVLQDSGVDLDSFSVSPASTLKSP 
TNVS PNCPPAEATALP FSG PREPSLKQWPS RVPQKQGGMGLAS W 
SQLASTPRAPGSRDARWERREPALRGAKDRLTIGKHLDMGSPQL 
RTRDRGWPSPRPEREKRTSQSARRPTCTESRWKSEEEVESDDEY 
LALPARLTQVSSLVSYLGSISTLVTLPTGDIKGQSPLEVSDSDG 
PASFPSSSSQSQLPPGAALQGSGDPEGQNPCFLRSFVRAHDSAG 
EGS LG S S Q ALG VS S GL L KTR P S LPARLDRW P FSD P D VEGQL PRK 
GGEQGKE SL VQC \ VKTF C \ CQLEEL I CWL YNV\ AD VTDHGTPAR 
SNLTSLK\ S S LQL YRQFKKD I DEHQSLTES VLQKGE I LLQCLLE 
NT P VL E D VLG R I AKQS GELE SHADRL YDS I LAS LDMLAG CTL I P 
DKKPMAAMEHPCEGV 


5424


3186 


905 


GVSMALG EE KAEAEASEDTKAQSYGRGSCRERELD IPGPMSGEQ 
PPRLEAEGGLISPVWGAEGIPAPTCWIGTDPGGPSRAHQPQASD 
ANREPVAERSEPALSGLPPATMGSGDLLLSGESQVEKTKLSSSE 
EFPQTLSLPRTTICSGHDADTEDDPSLADLPQALDLSQQPHSSG 
LSCLSQWKSVLSPGSAAQPSSCSISASSTGSSLQGHQERAEPRG 
GSLAKVSSSLEPWPQEPSSWGLGPRPQWSPQPVFSGGDASGL 
GRRRLS FQAE YWACVLPDSLPPS PDRHS PLWNPNKE YEDLLDYT 
YPLRPGPQLPKHLDSRVPADPVLQDSGVDLDS FSVS PASTLKSP 
TNVSPNCPPAEATALPFSGPREPSLKQWPSRVPQKQGGMGLASW
SQLASTPRAPGSRDARWERREPALRGAKDRLTIGKHLDMGSPQL 
RTRDRGWPSPRPEREKRTSQSARRPTCTESRWKSEEEVESDDEY 
LALPARLTQVSSLVSYLGSISTLVTLPTGDIKGQSPLEVSDSDG 
PASFPSSSSQSQLPPGAALQGSGDPEGQNPCFLRSFVRAHDSAG 
EGSLGSSQALGVSSGLLKTRPSLPARLDRWPFSDPDVEGQLPRK 

GGEQGKESLVQC\VKTFC\CQLEELICWLYNV\ADVTDHGTPAR
SNLTSLK\SSLQLYRQFKKDIDEHQSLTESVLQKGEILLQCLLE

NTPVLEDVLGR IAKQSGELESHADRLYDS ILASLDMLAGCTLIP 
DKKPMAAMEHPCEGV 


5425 


1086 


115 


GFCPSPSLGHQPPRVLHPTMSMAVETFGFFMATVGLLMLGVTLP 
NSYWRVSTVHGNVITTNTIFENLWFSCATDSLGVYNCWEFPSML 
ALSGYIQACRALMITAILLGFLGLLLGIAGLRCTNIGGLELSRK 
AKIiAATAGAPH\ ILPGICGMVAI \SWYAFNITR\DFSDPLYPGT 
KYELGPALYLGWSASLISILGGLCLCSACCCGSDEDPAASARRP 
YQAPVSVMPVATSDQEGDSSFGKYGRNALRVAALCRGPRCLPTA 
PKKRGPGRGPFPYSNLRGRPRPVPVAPPRPRPRVLHSHGPSQAK 
NCS WE VAYLPSEAGSL1 F 


5426 


42 


3435 


ATSSQSLGRADPPRGGTMERSPGEGPSPSPMDQPSAPSDPTDQP~ 

PAAHAKPDPGSGGQPAGPGAAGEALAVLTS FGRRLLVLI PVYLA 

GAVGLSVGFVLFGLALYLGWRRVRDEKERSLRAARQLLDDEEQL 

TAKTLYMSHRELPAWVSFPDVEKAEWLNKIVAQVWPFLGQYMEK 

L LAE TVAP A VRG S NPHLQT FTFTR VE LGEK P LR I IGVKVH PGQR 

KEQILLDLNISYVGDVQIDVEVKKYFCKAGVKGMQLHGVLRVIL

EPLIGDLPFVGAVSMFFIRRPTLDINWTGMTNLLDIPGLSSLSD 

TMIMDS IAAFLVLPNRLLVPLVPDLQDVAQLRSPLPRGI IRIHL 

LAARGLSSKDKYVKGLIEGKSDPYALVRLGTQTFCSRVIDEELN

PQWGETYEVMVHEVPGQE I EVEVFDKDPDKDDFLGRMKLDVGKV 

LQASVLDDWFPLQGGQGQVHLRLEWLSLLSDAEKLEQVLQWNWG 

VSSRPDPPSAAILWYLDRAQDLPMVTSELYPPQLKKGNKEPNP 

MVQLSIQDVTQESXAVYSTNCPVWEEAFRFFLQDPQSQBLDVQV 

KDD S RAL TLGAL TL PLARLLTAP E L I LDQ W FQLSS SG PNS R L YM 

KLVMRILYLDSSEICFPTVPGCPGAWDVDSENPQRGSSVDAPPR

P CHTT P D SQ FG T E HVLR I HVLEAQD L I AKDR FLGGLVKG KS DP Y 

VKLKLAGRSFRSHVVREDLNPRWNEVFEVIVTSVPGQELEVEVF 

DKDLDKDDFLGRCKVRLTTVLNSGFLDEWLTLEDVPSGRLHLRL 

E RLT P R P T AAE LE E VLQ VN S L I QTQ KS AELAAALLS I YMERAED 

LPLRKGTKHLSPYATLTVGDSSHKTKTISQTSAPVWDESASFLI 

RKPHTESLELQVRGEGTGVLGSLSLPLSELLVADQLCLDRWFTL 

SSGQGQ VLLRAQLG I LVSQHSG VEAHSHS YSHS SS SLSEEPELS 

GGP PH I TS S APE V\ RQRLTHVDS PLEAPAGPLGQVKLTL W YYS E 

ERKLVS I VHGCRS LRQNGRD P P DP YVS LLLL P DKNRG TKRRTS Q 

KKRTLSPEFNERFEWELPLDEAQRRKLDVSVKSNSSFMSREREL

LGKVQLDLAETDLSQGVARWYDLMDNKDKGSS 


5427 


42 


3435 


ATSSQSLGRADPPRGGTMERSPGEGPSPSPMDQPSAPSDPTDQP 
PAAHAKPDPGSGGQPAGPGAAGEALAVLTSPGRRLLVLIPVYLA 
GAVGLSVGFVLFGLALYLGWRRVRDEKERSLRAARQLLDDEEQL
TAKTLYMSHRELPAWVSFPDVEKAEWLNKIVAQVWPFLGQYMEK 
LLAETVAPAVRGSNPHLQTFTFTRVELGEKPLRIIGVKVHPGQR 
KEQILLDLNI S YVGDVQIDVE VKKYFCKAGVKGMQLHG VLRVI L 
EPLIGDLPFVGAVSMFFIRRPTLDINWTGMTNLLDIPGLSSLSD 
TMIMDS IAAFLVLPNRLLVPLVPDLQDVAQLRSPLPRGI IRIHL 
LAARGLSSKDKYVKGLIEGKSDPYALVRLGTQTFCSRVIDEELN
PQWGETYEVMVHEVPGQE I EVEVFDKDPDKDDFLGRMKLDVGKV 
LQASVLDDWFPLQGGQGQVHLRLEWLSLLSDAEKLEQVLQWNWG 
VS S RPDPPS AA I L WYLDRAQDL PMVTS EL YPPQLKKGNKEPNP 
MVQLSIQDVTQESKAVYSTNCPVWEEAFRFFLQDPQSQELDVQV 
KDDSRALTLGALTLPLARLLTAPELILDQWFQLSSSGPNSRLYM 
KL VMR I L YLD S S E I CF PTV P G CPG AWD VDS ENPQRG S S VD AP PR 
PCHTTPDSQFGTEHVLRIHVLEAQDLIAKDRFLGGLVKGKSDPY 
VKLKLAGRSFRSHWREDLNPRWNEVFEVIVTSVPGQELEVEVF 
DKDLDKDDFLGRCKVRLTTVLNSGFLDEWLTLEDVPSGRLHLRL
ERLTPRPTAAELEEVLQVNSLIQTQKSAELAAALLSIYMERAED
LPLRKGTKHLSPYATLTVGDSSHKTKTISQTSAPVWDESASFLI 
RKPHTESLELQVRGEGTGVLGSLSLPLSELLVADQLCLDRWFTL 
S SGQGQVLLRAQLG I LVS QHSGVEAHSHS YSHS S S S LS EE PELS 
GGPPHITSSAPEV\RQRLTHVDSPLEAPAGPLGQVKLTLWYYSE 
ERKLVSIVHGCRSLRQNGRDPPDPYVSLLLLPDKNRGTKRRTSQ 
KKRTLSPEFNERFEWELPLDEAQRRKLDVSVKSNSSFMSREREL 
LGKVQLDLAETDLSQGVARWYDLMDNKDKGSS 


5428 


3 


1839 


S S RS E RLS ACA I AP P WL VS SR PAR PAQLQR PGKMVEDGAE ELED 
LVHFSVSELPSRGYGVMEEIRRQGKLCDVTLKIGDHKFSAHRIV 
LAASIPYFHAMFTNDMMECKQDEIVMQGMDPSALBALINFAYNG 
NLA I DQQNVQS LLMGAS FLQLQS I KDACCTFLRERLHPKNCLGV 
RQFAETMMCAVLYDAANSFIHQHFVEVSMSEEFLALPLEDVLEL 
VSRDELNFVKS E EQVFEAALAWVR YDREQRGTFL \ RNLQSN I RLL 
FCRPQFLSDRVQQDDLVRCCHKCRDLVDEAKDYLLMPERRPHLP 
AFRTRPRCCTSIAGLIYAVGGLNSAGDSLNWEVFDPIANCWER 
CRPMTTARSRVGVAWNGLLYAIGGYDGQLRLSTVQAYNTETDT 
WTR VGSMNS KRS AMGT WLDGQ I YVCGGYDGNS S LSS VET YS PE 
TDKWTWTSMSSNRSAA\GVTVFEGRIYVSGGHDGLQIFSSVEH 
YNHHTATWHPAAGMLNKRCRHGAASLGSKMFVCGGYDGSGFLSI 
AEMYSSV\ADQWCLIVPM\HTRR\SRVSLGGPAVGRLYAVWGVT 
TGQSNL\SSVGDVLTPETDCWTFM\APMACHEGGVGVGCIPLLT 
I 


5429 


828 


202 


RREDALSSEGCLWPSESTVSGNGIPEPQVYAPPRPTDRLAVPPF 
AQRERFHRFQPTYPYLQHEIDLPPTISLSDGEEPPPYQGPCTLQ 
LRDPEQQLELNRESVRAPPNRTIFDSDLMDSARLGGPCPPSSNS 
GISATCYGSGGRMEGPPP\TYSEVIGHYPGSSFQHQQSSGPPSL 
LEGTRLHHTHIAPLESAAIWSKEKDKQKGHPL 


5430 


441 


1507 


QKRRKRRRKK IMKT I QP KMHNS I SWAI FTGLAALCLFQGVP VRS 
GDAT F P KAMDNVTVRQGE S ATLRCT I DNR VTR VAWLNRS T I L YA 
GNDKWCLDPRWLLSNTQTQYSIEIQNVDVYDEGPYTCSVQTDN 
HPKTSRVHLIVQVSPKIVEISSDISINEGNNISLTCIATGRPEP 
TVTWRH I S PKAVGFVS EDE YLE I QG I TREQSGDYE CS ASNDV \ A 
APV\VRRVKVTVNYPPYISEAKGTGVPVGQKGTLQCEASAVPSA 
E FQW YKDD KRL I / EG KKGVKVENRP F LS KL I FFNVS EHDYGN YT 
CVASNKLGHTNAS IMLFGPGAVS EVSNGTSRRAGCVWLLPLLVL 
HLLLKF 


5431 


2 


1312 


AAAAPGS RRRRPLPDRPHMAHG YEAPPP PAPRS PAWRARSKPV \ 
LPGITINP\TIAEGPSP\TSEGASEANLVDLQKKLEELELDEQQ 
KKRLEAFLTQKAKVGELKDDDFERISELGAGNGGVVTKVQHRPS 
GLIMARKLIHLEIKPAIRNQIIRELQVLHECNSPYIVGFYGAFY 
SDGEIS I CMEHMDGGSLDQVLKEAKRI PEEILGKVSIAVLRGLA 
YLREKHQIMHRDVKPSNILVNSRGEIKLCDFGVSGQLIDSMANS 
FVGTRSYMAPERLQGTHYSVQSDIWSMGLSLVELAVGRYPIPPP 
DAKELEAI FGRPWDGEEGEPHS ISPRPRPPGRPVSGHGMDSRP 
AMAIFELLDYIVNEPPPKLPNGVFTPDFQEFVNKCLIKNPAERA 
DLKMLTNHTFIKRSEVEEVDFAGWLCKTLRLNQPGTPTRTAV 


5432 


2 


1312 


AAAAPGS RRRRPLPDRPHMAHG YEAPPP PAPRS PAWRARSKPV\ 
LPGITINP\TIAEGPSP\TSEGASEANLVDLQKKLEELELDEQQ 
KKRLEAFLTQKAKVGELKDDDFERISELGAGNGGWTKVQHRPS 
GLIMARKLIHLEIKPAIRNQIIRELQVLHECNSPYIVGFYGAFY 
S DGE ISI CMEHMDGGSLDQVLKEAKRI PEE ILGKVS IAVLRGLA 
YLREKHQIMHRDVKPSNILVNSRGEIKLCDFGVSGQLIDSMANS 
FVGTRSYMAPERLQGTHYSVQSDIWSMGLSLVELAVGRYPIPPP 
DAKELEAI FGRPWDGEEGEPHS ISPRPRPPGRPVSGHGMDSRP 
AMAI FELLDYI VNEPPPKLPNGVFTPDFQEFVNKCLI KNPAERA 
DLKMLTNHTFIKRSEVEEVDFAGWLCKTLRLNQPGTPTRTAV 


5433 


360 


1885 


SVQEDKVGFEDPLHLCSWRARACPCTWPHC/CTGLLECLGFAGV 
L FG W P S LVF V F KNED Y FKDL CGPDAG P I GNATG Q ADC KAQD ER F 
SLIFTLGSFMNNFMTFPTGYIFDRFKTTVARLIAIFFYTTATLI
IAFTSAGSAVLLFLAMPMLTIGGILFLITNLQIGNLFGQHRSTI 
ITLYNGAFDSSSAVFLIIKLLYEKGISLR/VLLHLHLCLQYLAC 
STHFPPDAPGAHPI PTAPQLQLWPVPWEWHHKGREG /QQLSMKT 
GSYSQRSSFQRRKRPQGQGRSRNSAPSGATL/CSRRFAWHLVWL 

CAPWNGLLMDRLKQKYQKEARKTGSSTLAVALCSTVPSIiAiiTSL 
LCLGFALCASVPILPLQYLTFILQVISRSFLYGSNAAFLTLAFP 
S EH FGKLFG L VMAL S AWS LLQ F P I FTLI KGSLQNDPFYVNVMF 
MLAILLTFFHPFLVYRECRTWKESPSAIA 


5434 


66 


652 


RYAALIISLIQHKLLWRNQHCSRCVIMSPAQSAGLNWLF/GSG'K 
HGPFLGCSQYPACDYVRPLKSSADGHIVKVLEGQVCPACGANLV 
LRQGRFGMFIGCINYPECEHTBLIDKPDETAITCPQCRTGHLVQ 
RRSRYGKTFHSCDRYPECQFAINFKPIAGECPECHYPLLIEKKT 
AQG VKHFCAS KQCG KP VSAE 


5435 


4704 


1597 


PGDSSQRLAEMSNAKERKHAKKMRNQPTNVTLSSGFVADRGVKH ' 

HSGGEKPFQAQKQEPHPGTSRQRQTRVNPHSLPDPEVNEQSSSK 

GMFR KKGG WKAGPEGTS QE I P KY I TAS TFAQARAAE I S AMLKAV 

TQ KS SNS L VFQTLP RHMRRRAM S HNVKRL P RRLQE I AQKKAEKA 

VHQ KKEHS KNKCHKARR CHMNRTLE FNRRQ KKN I WL ETH I WHAK 

RFHMVKKWGYCLGERPTVKSHRACYRAMTNRCLLQDLSYYCCLE 

LKGKSEEILKALSGMCNIDTGLTFAAVHCLSGKRQGSLVLYRVN 

KYPREMLGPVrFIWKSQRTPGDPSESRQLWIWLHPTLKQDILEE 

IKAACQCVEPIKSAVCIADPLPTPSQEKSQTELPDEKIGKKRKR 

KDDGENAKPIKKIIGDGTRDPCLPYSWISPTTGI I ISDLTMEMN 

R FRL I G PLS H S I LTE A I KAAS VHTVG EDTE E TPHRW W I ET CKKP 

DS VS LHCRQEAI FELLGG ITSPAEI PAGTILGLTVGDPRINLPQ 

KKSKALPNPEKCQDNEKVRQLLLEGVPVECTHSFIWNQDICKSV 

TENK I S DQDLNRMRSE LLVPGSQL I LG PHES K IPILLIQQPGKV 

TGEDRLGWG S G WD VLLP KG WGMAFWI P F I YRG VR VGGLKESA VH 

SQ YKRS PNVPGDFPD C PAGMLFAEEQAKNLLE KYKRRP PAKRPN 

YVKLGTLAPFCCPWEQLTQDWESRVQAYEEPSVASSPNGKESDL 

RRSEVPCAPMPKKTHQPSDEVGTSIEHPRBAEEVMDAGCQESAG 

PERITDQEASENHVAATGSHLCVLRSRKLLKQLSAWCGPSSEDS 

RGGRRAPGRGQQGLTREACLSILGHFPRALVWVSLSLLSKGSPE 

PHTM I CVPAKE D FLQLHE DWH YCG P QE S KHSD P FR S KI LKQ KE K 

KKREKRQKP\GRASSDGPAGEEPVAGQEALTLGLWSGPLPRVTL 

HCSRTLLGFVTQGDFSMAVGCGEALGFVSLTGLLDMLSSQPAAQ 


5436 


1781 


635 


ASDSIPWSEARTTRKIiAQRGCQWSLPERMPLWFCGLPYSGKSR 
RAE EL R VALAAEGRA VYVVDDAAVLG AED PAVYGDS ARE KALRG 
ALRASVERRLSRHDVVILDSLNYIKGFRYELY\CLARAARTPLC 
LVYCVRPGGPIAGPQVAGANENPGRNVSVSWRPRAEEDGRAQAA 
GSSVLRELHTADSWNGSAQADVPKELERBESGAAESPALVTPD 
SEKSAKHGSGAFYSPELLEALTLRFEAPDSRNRWDRPLFTLVGL 
EEPLPLAGIRSALFENRAPPPHQSTQSQPIiASGSFLHQLDQVTS 
W LU\uijnaHSj)\£>j\ v ir WJJjLf TLPGTTEHLR r TR PLTMAELSRLRR 
QFISYTKMHPNNENLPQLANMFLQYLSQSLH 


5437 


739


1*72 


CQEAASEFGGPLHTPAMFLRRLGGWLPRPWGRRKPMRPDPPYPE 
PRRVDSSSENSGSDWDSAPETMEDVGHPKTKDSGALRVSRAASE 
PSKEEPQVEQLGSKRMDSLKWDQPISSTQESGRLEAGGASPKLR 
WDHVDS GG TRR PGVS P EGGL \ G VPG PGAP LE KPGRRE KLLGWLR 
GEPGAPSRYLGGPEEPT/^T^TKTT.TT.HT.T.PT.T.IVQZiT.T.ZVT.r'GonT "D 
AALDTLGLRGPLGLWLHGLLSFLAALHGLHAVLSLIiTAHPLHFA 
CLFGLLQALVLAVSLREPNGDEAATDWESEGLEREGEEQRGDPG 
KGL 


5438


2443 


1152 


TKPRKRRHQPASQRQRPWSSDSTGDLLARGKGRKEENKGSDRVS 
LAPPSLRRPMMCQSEARQGPELRAAKWLHFPQLALRRRLGQLSC
MSRPALKLRSWPLTVLYYLLPFGALRPLSRVGWRPVSRVALYKS 
VPTRLLSRAWGRLNQVELPHWLRRPVYSLYIWTFGVNMKEAAVE 
DLHHYRNLSEFFRRKLKPQARPVCGLHSVISPSDGRILNFGQVK
NCEVEQVKGVTYSLESFLGPRMCTEDLPFPPAASCDSFKNQLVT
REGNELYHCVIYLAPGDYHCFHSPTDWTVSHRRHPPGSLMSVNP * 
GMARWIKELFCHNERWLTGDWKHGFFSLTAVGAT\NWGSIRIY
FDRDLHTNSPRHSKGSYNDFSFVTHTNREGVPMALRGEHLG/QS 
FNLGSTIVLIFEAPKDFNFQLKTGOKIRFGEALGSL 


5439 


2443 


1152 


TKPRKRRHQPASQRQRPWSSDSTGDLLARGKGRKEENKGSDRVS
LAPPSLRRPMMCQSEARQGPELRAAKWLHFPQLALRRRLGQLSC
MSRPALKLRSWPLTVLYYLLPFGALRPLSRVGWRPVSRVALYKS
VPTRLLSRAWGRLNQVELPHWLRRPVYSLYIWTFGVNMKEAAVE
DLHHYRNLSEFFRRKLKPQARPVCGLHSVISPSDGRILNFGQVK
NCEVEQVKGVTYSLESFLGPRMCTEDLPFPPAASCDSFKNQLVT
REGNELYHCVIYLAPGDYHCFHSPTDWTVSHRRHFPGSLMSVNP
GMARWIKELFCHNERWLTGDWKHGFFSLTAVGAT\NWGSIRIY
FDRDLHTNSPRHSKGSYNDFSFVTHTNREGVPMALRGEHLG/QS
FNLGSTIVLIFEAPKDPNFQLKTGQKIRFGEALGSL


5440 


693 


253 


EPIPVTPDHRLVTMTHIV\QTFSPVNS\GQPPNYEMLKEEQEVA 
MLGAPHNPAPPMSTVIHIRSETSVPDHWWSLFNTLFMNTCCLG 
FIAFAYS VKS RDRKMVGDVTGAQAYAS TAKCLN I WAL I LG I FMT 
ILLIIIPVLWQAQR 


5441 


2 


2054 


CRDGGKNGFMVSPMKPLEIKTQCSGPRMDPKICPADPAFFSFIN ' 

NSDLWVANIETGEERRLTFCHQGLSNVLDDPKSAGVATFVIQEE 

FDRFTGYWWCPTASWEGSEGLKTLRILYEEVDESEVEVIHVPSP 

ALEERKTDSYRYPRTGSKNPKIALKLAEFQTDSQGKIVSTQEKE 

LVQPFSSLFPKVEYIARAGWTRDGKYAWAMFLDRPQQWLQLVLL 

P PALF I PSTENEEQ \ RLASARAV PRNVQP Y W YEEVTNVW INVH 

J-'J-r irr ryj^aKjauiLLi^c JjKAWhLKlXar LHIjYKVTAVLXSQGYDW 

SEPFSPGEGEQSLTNAIWVNEETKLVYFQGTKDTPLEHHLYWS 

YEAAGE I VRLTTPGFSHS CSMSQNFDMFVSHYSS VS TP PCVHVY 

KLSGPDDDPLHKQPRFWASMMEAAKIFHFHTRSDVRLYGMIYKP 

HALQ PGKKHPT VLFVYGG PQVQLVNNS F KG I KYLRLNTLASLG Y 

AWVIDGRGSCQRGLRFEGALKNQMGQVEIEDQVEGLQFVAEKY 

GFIDLSRVAIHGWSYGGFLSLMGLIHKPQVFKVAIAGAPVTVWM 

AYDTGYTERYMDVPENNQHGYEAGSVALHVEKLPNEPNRLLILH 

GFLDENVHFFHTNFLVSQLIRAGKPYQLQVALPPVSPQIYPNER 

HS IRCPESGEHYEVTLLHFLQEYL 


5442 


1 


3474


CG QRS RRRS P DM P EAKPAAKKAP KG KDAP KG AP KEAP P KE APAE 
APKEAPPEDQSPTAEEPTGVFLKKPDSVSVETGKDAWVAKVNG 
KELPDKPT I KWFKGKWLELGS KSGARFS FKESHNS ASNVYTVEL 
HIGKVVLGDRGYYRLEVKAKDTCDSCGFNIDVEAPRQDASGQSL 
ESFKRTSEKKSDTAGEIiDFSGLLKKREWEEEKKKKKKDDDDLG 
IPPEIWELLKGAKKSEYEKIAFQYGITDLRGMLKRLKKAKVEVK 
KS AAFTKKLDPA YQVDRGNK I KLM VE I S DPDLTL KWF KNGQE I K 
PSSKYVFENVGKKRILTINKCTLADDAAYEVAVKDEKCFTELFV 
KEPPVLIVTPLEDQQVFVGDRVEMAVEVSEEGAQVMWMKDGVEL 
TREDS F KAR YRFKKDG KRH I L I FS D WQEDRGR YQ VI TNGGQ CE 
AELIVEEKQLEVLQDIADLTVKASEQAVFKCEVSDEKVTGKWYK 
NGVEVRPSKRITISHVGRFHKLVIDDVRPEDEGDYTFVPDGYAL 
GSLSAKLNFLEIKVEYVPKQ\EPPKIPLGFASGGKTSENAD/IV 
WAGNKIiRLDV\SITGEAPSPFAT\WLKG\DEVFTTTEGRTRIE 
KR VDCSS FVI ES AQREDEGRYTI KVTNP IGED VAS I FLQ WDVP 
DPPEAVRITSVGEDWAILVWEPPMYDGGKPVTGYLVERKKKGSQ 
RWMKLNFE VFTSTTYESTKMIEG I LYEMRVFAVNAIGVSQPSMN 
TKPFMP IAPTSEPLHLI VED VTDTTTTLKWR P PNR IGAGG I DG Y 
L VE YCLEGS E E W VP ANTE P VER CG F T VKNLPTG AR I LFR WG VN 
IAGRSEPATLAQPVTIREIAEPPKIRLPRHLRQTYIRKVGEQLN 
LWPFQGKPRPQWWTKGGAPLDTSRVHVRTSDFDTVFFVRQAA 
RSDSGE YELS VQ I ENM KDTAT I R I R WE KAGPP INVMVKE VWGT 
NALVEWQAPKDDGNSEIMGYFVQKADKKTMEWFNVYERNRHTSC 
TVS DL» I VGNE Y Y FR V YTEN I CGLS DS PG VS KNTAR I LKTG I TFK 
PFEYKEHDFRMAPKFLTPLIDRWVAGYSAALNCAVRGHPKPKV 
VWMKNKME I REDPKFL I TKTYQGVLTLNIRRPS P FDAGT YTCRAV 
NELGEALAECKLEVRVPQ


5443 


66 


1003 


SRGQLDAGQS SEQHGGNRQPEQS RSRSSSSSSS PRRS RS AAE PA 
MALSMPLNGLKEEDKEPLIELFVKAGSDGESIGNCPFSQRLFMI 
LWLKG W F S VTT VDLKR KP ADLQNLAPGTH P P F I TENS E VKTD V 
NKIEEFLEEVLCPPKYLKLSPKHPESNTAGMDIFAKFSAYIKNS 
RPEANEALERGLLKTLQKIjDEYLNSPLPDEIDENSMEDIKFSTR 
KFLDGNEMTLADCNLLPKLHIVKWAKKYRNFDIPKEMTGIWRY 
LTNAYSRDEFTNTCPSDKEVE I \ AYSDVAKRLHQVKSRLLKEVS 
FMSSP 


5444 


2 


344 


S G P I G VTGAQMAKWIiRD YLS FGGRR P P PQ P P TPD YTES D I LRA Y 
RAQKNLDFEDPY*DSESRLEPDPAGPGDSKNPGDAKYGSPKHRL 
IKVEAADMARAKALLGGPGEELEADTEYLDPFDAQPHPAPPDDG 
YMEPYDAQWVMSELPGRGVQLYDTPYEEQDPETADGPPSGQKPR 
QS RM P QEDE R PADE YDQP WEW KKDH I S RAFAVQ FDS PE WERTPG 
SAKELRRPPPRS PQPAERVDPALPLEKQPWFHGPLNRADAESLL 
SLCKEGSYLVRLSETNPQDCSLSLRSSQGFLHLKFARTRENQW 
LGQHSGPFPSVPE£VIjHYSSRPLPVQGAEHLALLYPWTQTP*Q 
*PDWGDRRPNGQVATGLPELWGAEAPSAAAHPGLHRERHPEGLP 
RAEKPGLRGPLLGLREPLGAGPRGPWGLQEPRRCQVWFSQAPAH 
Q GGG CG YG QSQGPSGRP RGG AG S RH 


5445 


2364 


486 


ILSRGFLGSVEICIQLPLPASEPVLLLTWARRRWRETRSRREPT 
TLRAQSVCPWWI*ETRMNRSIPVEVDESEPYPSQLLKPIPEYSP 
EEESEPPAPNIRNMAPNSLSAPTMLHNSSGDFSQAHSTLKLANH 
QRPVSRQVTCLRTQVLEDSEDSFCRRHPGLGKAFPSGCSAVSEP 
ASESWGALPAEHQFSFMEKRNQWLVSQLSAASPDTGHDSDKSD 
QSLPNASADSLGGSQEMVQRPQPHRNRAGLDLPTIDTGYDSQPQ 
DVLGIRQLERPLPLTSVCYPQDLPRPLRSREFPQFEPQRYPACA 
QMLPPNLSPHAPWNYHYHCPGSPDHQVPYGHDYPRAAYQQVIQP 
ALPGQPLPGASVRGLHPVQKVILNYPSPWDQEERPAQRDCSFPG 
LPRHQDQPHHQPPNRAGAPGESLECPAELRPQVPQPPSPAAVPR 
PPSNPPARGTLKTSNLPEELRKVFITYSMDTAMEWKFVNFLLV 
NG FQTA ID I FEDR I RG I D 1 1 KWMER YLRD KT VM 1 1 VA ISP K YKQ 
DVEGAESQLDEDEHGLHTKYIHRMMQIEFIKQGSMNFRFIPVLF 
PNAKKEH V PT WLQNTHVYS W PKNKKN I LLRLLR E E E YVAP P RG P 
LPTLQWPL 


5446 


972 


161 


S S WS WCTGRMR KTRLWGLL WM L F VS E LRAAT KLTE EK YE L KEGQ 
TLD VKCD YTLE KFAS S QKAWQI I RDGEMPKTLACTER PS KNS H P 
VQVGRIILEDYHDHGLLRVRMVNLQVEDSGLYQCVIYQPPKEPH 
MLFDRIRLWTKGFSGTPGSNENSTQNVYKIPPTTTKALCPLYT 
TPRTVTQAPPKS TADVSTPDS EINLTNVTDI IRVPVFNI VI LLA 
GGFLSKSLVFSVLPAVTLRSFVP*AHEPTRMSSDFQPHPSGSCA 
KGGGRR 


5447 


207 


617 


MTARTLS LMAS L VAYDDS DS EAETEHAGSFNATGQQKDTSG VAR 
PPGQDFASGTLDVPKAGAQPTKHGSCEDPGGYRLPLAQLGRSDR 
GSCPSQRLQWPGKEPQVTFPIKEPSCSSLWTSHVPASHMPLAAA 
RFKQVKLSRNFPKSSFHAQSESETVGKNGSSFQKKKCEDCWPY 
TPRRLRQRQALS TETGKGKD VEPQGPPAGRAPAPL YVGPG VS EF 
IQ P YLNSHYKE TT VPRKVL FHLRGHRGP VNT I QW CP VLS KSHML 
LSTSMDKTFKVWNAVDSGHCLQTYSLHTBAVRAARWAPCGRRIL 
SGGFDFALHLTDLETGTQLFSGRSDFRITTLKFHPKDHNIFLCG 
GFSSEMKAWDIRTGKVMRSYKATIQQTLDILFLREGSEFLSSTD 
ASTRDS ADRTI I AWDFRTSAKI SNQ I FHERFTCPSLALHPRE PV 
FLAQTNGNYLALFSTVWPYRMSRRRRYEGHKVEGYSVGCECSPG 
GDLLVTGS ADGRVLMYS FRTAS RACTLQGHTQACVGTTYHP VLP 
SVLATCSWGGDMKIWH*AFHWLSLGEAIGDLAPARGYSGPGRSL 
KS PS PS KS LLVLLCGRAMFQ PATCPWQLPALSK 


5448


194 


1833 


MASKVTDAIVWYQKKIGAYDQQIWEKSVEQREIKGLRNKPKKTA
HVKPDLIDVDLVRGSAFAKAKPESPWTSLTTKGIVRWFFPFFF
RWWLQVTSKVIFFWLLVLYLLQVAAIVLFCSTSSPHSIPLTEVI
GPIWLMLLLGTVHCQIVSTRTPKPPLSTGGKRRRKLRKAAHLEV
HREGDGSSTTDNTQEGAVQNHGTSTSHSVGTVFRDLWHAAFFLS
GSKKAKNSIDKSTETDNGYVSLDGKKTVKSGEDGIQNHEPQCET 
IRPEETAWNTGTLRNGPSKDTQRTITNVSDEVSSEEGPETGYSL
RRHVDRTSEGVLRNRKSHHYKKHYPNEDAPKSGTSCSSRCSSSR 
QDSESARPESETEDVLWEDLLHCAECHSSCTSETDVENHQINPC 
VKKEYRDDPFHQSHLPWLHSSHPGLEKISAIVWEGNDCKKADMS 
VLEISGMIMNRVNSHIPGIGYQIFGNAVSLILGLTPFVFRLSQA 
TDLEQLTAHSASELYVIAFGSNEDVIVLSMVIISFVA/RVSLVWI 
FFFLLCVAERTYKQVGIM*TSEGVLRNRKSHHYKKHYPNEDAPK
SGTSCSSRCSSSRQDSESARPESETEDVLWEDLLHCAECHSSCT 
SETDVENHQINPCVKKEYRDDPFHQSHLPWLHSSHPGLEKISAI 
VWEGNDCKKADMSVLEISGMIMNRVNSHIPGIGYQIFGNAVSLI 
LGLTPFVFRLSQATDLEQLTAHSASELYVIAFGSNEDVIVLSMV 
IISFWRVSLVWIFFFLLCVAERTYKQVGIM


5449 


194 


1833 


MASKVTDAIVWYQKKIGAYDQQIWEKSVEQREIKGLRNKPKKTA
HVKPDLIDVDLVRGSAFAKAKPESPWTSLTTKGIVRWFFPFFF
RWWLQVTSKVIFFWLLVLYLLQVAAIVLFCSTSSPHSIPLTEVI
GPIWLMLLLGTVHCQIVSTRTPKPPLSTGGKRRRKLRKAAHLEV 
HREGDGSSTTDNTQEGAVQNHGTSTSHSVGTVFRDLWHAAFFLS 
GSKKAKNSIDKSTETDNGYVSLDGKKTVKSGEDGIQNHEPQCET 
IRPEETAWNTGTLRNGPSKDTQRTITNVSDEVSSEEGPETGYSL 
RRHVDRTSEGVLRNRKSHHYKKHYPNEDAPKSGTSCSSRCSSSR
QDSESARPESETEDVLWEDLLHCAECHSSCTSETDVENHQINPC
VKKEYRDDPFHQSHLPWLHSSHPGLEKISAIVWEGNDCKKADMS
VLEISGMIMNRVNSHIPGIGYQIFGNAVSLILGLTPFVFRLSQA
TDLEQLTAHSASELYVIAFGSNEDVIVLSMVIISFVVRVSLVWI 
FFFLLCVAERTYKQVGIM*TSEGVLRNRKSHHYKKHYPNEDAPK 
SGTSCSSRCSSSRQDSESARPESETEDVLWEDLLHCAECHSSCT 
SETDVENHQINPCVKKEYRDDPFHQSHLPWLHSSHPGLEKISAI 
VWEGNDCKKADMSVLEISGMIMNRVNSHIPGIGYQIFGNAVSLI
LGLTPFVFRLSQATDLEQLTAHSASELYVIAFGSNEDVIVLSMV 
IISFWRVSLVWIFFFLLCVAERTYKQVGIM


5450 


813


1242 


GQQFASFFG*NHPEVTVAMALTDIDLQLQFSMSQPEALLLLAAG 
PADHLLLQLYSGHLQVRLVLGQEELRLQT PAETLLS DS I PHTW 
LTWEGWATLSVDGFLNASSAVPGAPLEVPYGLFVGGTGTLGLP 
YLRGTSRPLRGCLHAATLNGRSLLRPLTPDVHEGCAEEFSASDD 
VALGFSGPHSLAAFPAWGTQDEGTLEFTLTTQSRQAPLAFQAGG 
RRGDFIYVDIFEGHLRAVVEKGQGTVLLHNSVPVADGQPHEVSV 
HINAHRLEISVDQYPTHTSNRGVLSYLEPRGSLLLGGLDAEASR 
HLQEHRLGLTPEATNASLLGCMEDLSVNGQRRGLREALLTRNMA 
AGCRLEEEEYEDDAYGHYEAFSTLAPEAWPAMELPEPCVPEPGL 
PP VFANFTQLLT I S PL WAEGGTAWLEWRHVQPTLDLMEAELRK 
SQ VLFS VTRGAH YGELELD I LGAQAR KM FTLLDWNR KAR F I HD 
GSEOTSDQLVLEVSVTARVPMPSCLRRGQTYLLPIQVNPVNDPP 
H I I FPHGS LM V ILEHTQKPLGP E VFQA Y D P DSACEGLT FQ VLGT 
SSGLPVERRDQPGEPATEFSCRELEAGSLVYVHCGGPAQDLTFR 
VS DGLQAS P PATL KWAI RPAI Q I HRS TG LRLAQG SAMP I LP AN 
LSVETNAVGQDVSVLFRVTGALQFGELQKHSTGGVEGAEWWATQ 
AFIlQRDVEQGRWYLSTDPQHHAYDTVENIiALEVQVGQEILSNXi 
S FPVTI QRATVWMLRLEPLHTQNTQQETLTTAHLEATLEEAGPS 
P PTFH YE WQAPRKGNLQLQGTRLSDGQGFTQDD I QAGRVT YGA 
TARASEAVEDTFRFRVTAPP Y FS PL YT FP IH IGGDPDAP VLTNV 
LL WPEGGEG VLS ADHL FVKS LNS AS YL YEVMERPRLGRLAWRG 
TQDKTTMVTS FTNEDLLRGRLVYQHDDSETTEDDI PFVATRQGE 
SSGDMAWEEVRGVFRVAIQPVNDHAPVQTISRIFHVARGGRRLL 
TTDDVAFSDADSGFADAQLVLTRKDLLFGSIVAVDEPTRPIYRF 
TQEDLRKRR VLFVHSGADRGW I QLQVSDGQHQATALLE VQAS E P 
YLRVANGSSLWPQGGQGTIDTAVLHLDTNLDIRSGDEVHYHVT 
AG PRWGQLVRAGQ P AT A FSQQD L LDG AVL YS HNGS LS P E DTMAF 
S VE AGP VHTDATLQ VT I ALEG P LAPLKL VRHKK I YVFQG EAAE I 
RRDQLEAAQEAVPPADI VPS VKS PPSAGYLVMVSRGA1ADE PPS 
LDPVQSFSQEAVDTGRVLYLHSRPEAWSDAFSLDVASGLGAPLE 
GVLVELEVLPAAIPLEAQNFSVPEGGSLTLAPPLLRVSGPYFPT 
LLGLSLQVLEPPQHGPLQKEDGPQARTLSAFSWRMVEEQLIRYV 
HDG S ETLTDS FVLMANAS EMDRQSHP VAFT VTVIiP VNDQP P I LT 
TNTGLQMWEGATAPIPAEALRSTDGDSGSEDIiVYTIEQPSNGRV 
VLRGAPGTEVRSFTQAQLDGGLVLFSHRGTLDGGFPFRLSDGEH 
TSPGHFFRVTAQKQVLLSLKGSQTLTVCPGSVQPLSSQTLRASS 
SAGTDPQLLLYRWRGPQLGRLFHAQQDSTGEALVNFTQAEVYA 
GNI LYEHEMPPEPFWEAHDTLELQLSS PPARDVAATLAVAVSFE 
AACPQR PSHL WKNKGLWVPEGQRAR I TVAALDAS NTJuAS VPS PQ 
R S E HD VLFQ VTQ F PSRGQLLVS E E PLHAGQ PHFLQSQ LAAGQ LV 
YAHGGGGTQQDGFHFRAHLQGPAGASVAGPQTSEAFAITVRDVN 
ERPPQPQASVPLRLTRGSRAPISRAQLSWDPDSAPGEIEYEVQ 
RAPHNG FLSLVGGGLG P VTRFTQADVDSGRLAFVANGSS VAGI F 

SQ QQLR WSDR E E P EAA YRL I QG P Q YGHLL VGGR PTSAFS QFQ I 
DQGEWFAFTNFSSSHDHFRVLALARGVNASAVVNVTVRALLHV 
WAGGPWPQGATLRLDPTVLDAGELANRTGSVPRFRLLEGPRHGR 
WRVPRARTEPGGSQLVEQFTQQDLEDGRLGLEVGRPEGRAPGP 
AGDS LTLELWAQGVPPAVAS LDFATE P YNAARP YSVALLS VPEA 
ARTEAGKPESS TPTGEPG PMASS PEPAVAKGGFLS FLEANMFS V 
1 1 PMCLVLLLLALI LPLLFYLRKRNKTGKHDVQVbTAKPRNGLA 
GDTETFRKVEPGQAIPLTAVPGQGPPPGGQPDPELLQFCRTPNP 
ALKNGQYWV 


5451 


1 


2274 


RDSSEQGRTGDTLGRPSACMDALKPPCLWRNHERGKKDRDSCGR 
KNSEPGSPHSLEALRDAAPSQGLNFLLLPTKMLFIFNFLFSPLP 
TPALICILTFGAAIFLWLITRPQPVLPLLDLNNQSVGIEGGARK 
GVSQKNNDLTSCCFSDAKTMYEVFQRGLAVSDNGPCliGYRKPNQ 
PYRWLSYKQVSDRAEYLGSCLLHKGYKSSPDQFVGIFAQNRPEW 
IISELACYTYSMVAVPLYDTLGPEAIVHIVNKADIAMVICDTPQ 
KAL VL I GNVE KG FT P S LKV 1 1 LMDP FDDDLKQRGE KSGIEILSL 
YDAENLGKEHFRKPVPPS PEDLS VI CFTSGTTGDPKGAMITHQN 
I VSNAAAFLKC VEHAYE PT P DD VA TSYTiPT iAUMFTTP T uniUWQ 

CGARVGFFQGDIRLLADDMKTLKPTLFPAVPRLLNRIYDKVQNE 
AKTPLKKFLLKLAVSSKFKELQKGI I RHDSFWDKLI FAK IQDSh 
GGRVRVIVTGAAPMSTSVMTFFRAAMGCQVYEAYGQTECTGGCT 
FTLPGDWTSGHVGVPLACNYVKIiEDVADMNYFTVNNEGEVClKG 
TNVFKG YLKDPEKTQEALDSDGWLHTGD IGRWLPNGTLKI I DRK 
KN I F KLAOGE Y I APE KI EN I YNRSOPVT.O T PUHR T? Q T .P Q Q T VP \7 
WPDTD VL P S FAAKLG VKG S F EELCQNQ WREA I L EDLQ K I G KE 
SGLKTFEQVKAIFLHPEPFSIENGLLTPTLKAKRGELSKYFRTQ 
IDSLYEHIQD 


5452 


1833 


1138 


SRVPSLCLSLSLSLSPSREPVAGAPGCGTAGPPAMATLWGGLLR
LGSLLS hS CIjALSVLLLAQIjSDAAKNFEDVRCKCI CP PYKENSG 
HIYNKNISQKDCDCLHWEPMPVRGPDVEAYCLRCECKYEERSS 
VTIKVTIIIYLSILGLLLLYMVYLTLVEPILKRRLFGHAQLIQS 
DDD IG DHQ P FANAHD VLAR S R S RANVLNKVE YAQQRW KLQ VQ E Q 
RKSVFDRHWLS 


5453 


111 


1520 


PSIPAAVPQSAPPEPHREETVTATATSQVAQQPPAAAAPGEQAV

PQEERSQQQDDIEELETKAVGMSNDGRFLKFDIEIGRGSFKTVY 
KGLDTETTVEVAWCELQDRKLTKSERQRFKEEAEMLKGLQHPNI 
VRFYDSWESTVKGKKCIVLVTELMTSGTLKTYLKRFKVMKIKVL 
RSWCRQILKGLQFLHTRTPPIIHRDLKCDNIFITCPTGSVKIGD 
LGLATLKRASFAKSVIGTPEFMAPEMYEEKYDESVDVYAFGMCM 
LEMATSEYPYSECQNAAQIYRRVTSGVKPASFDKVAIPEVKEII
EGCIRQNKDERYSIKDLLNHAFFQEETGVRVEIAEEDDGEKIAI
KLWLRIEDIKKLKGKYKDNEAIEFSFDLERNVPEDVAQEMVESG
YVCEGDHKTMAKAIKDRVSLIKRKREQRQL*


5454 


111 


1520 


PSIPAAVPQSAPPEPHREETVTATATSQVAQQPPAAAAPGEQAV
AGPAPSTVPSSTSKDRPVSQPSLVGSKEEPPPARSGSGGGSAKE 
PQEERSQQQDDIEELETKAVGMSNDGRFLKFDIEIGRGSFKTVY 
KGLDTETTVEVAWCELQDRKLTKSERQRFKEEAEMLKGLQHPNI 
VRFYDSWESTVKGKKCIVLVTELMTSGTLKTYLKRFKVMKIKVL
RSWCRQILKGLQFLHTRTPPIIHRDLKCDNIFITGPTGSVKIGD
LGLATLKRASFAKSVIGTPEFMAPEMYEEKYDESVDVYAFGMCM
LEMATSEYPYSECQNAAQIYRRVTSGVKPASFDKVAIPEVKEII
EGCIRQNKDERYSIKDLLNHAFFQEETGVRVELAEEDDGEKIAI
KLWLRIEDIKKLKGKYKDNEAIEFSFDLERNVPEDVAQEMVESG
YVCEGDHKTMAKAIKDRVSLIKRKREQRQL*


5455


1359 


377 


LTMVS PATRKS LPKVKAMDFITSTAI LPLLFGCLGVFGLFRLLQ 
WVRGKAYLRNAVWITGATSGLGKECAKVFYAAGAKLVLCGRNG 
GALEELIRELTASHATKVQTHKPYLVTFDLTDSGAIVAAAAEIL 
QCFGYVDILVNNAGISYRGTIMDTTVDVDKRVMETNYFGPVALT
KALLPSM IKRRQGHI VAISSIQGKMS I PFRSA YAASKHATQAFP 
DCLRAEMEQYE IEVTVI S PGYIHTNLSVNAITADGSRYGVMDTT 
TAQGRS P VE VAQD VLAAVGKKKKD VI LADLLPSLAVYLRTLAPG 
LFFSLMASRARKERKSKNS 


5456 


2 


2332 


CGAGLVAAGAVLVLYPASRAGERTRVPGSPAPSSLPLHSPGACG

TEVDMDPQRSPIiLEVKGNIELKRPLIKAPSQLPLSGSRLKRRPD 

QMEDGLEPEKKRTRGLGATTKITTSHPRVPSLTTVPQTQGQTTA 

OKVSKKTGPRCSTAIATGLKNQKPVPAVPVQKSGTSGVPPMAGG 

KKPSKRPAWDLKGQLCDLNAELKRCRERTQTLDQENQQLQDQLR 

DAQQQVKALGTERTTLEGHIiAKVQAQAEQGQQELKNLRACVLEL 

EERLSTQEGLVQELQKKQVELQEERRGLMSQLEEKERRLQTSEA 

ALSSSQAEVASLRQETVAQAALLTEREERLHGLEMERRRLHNQL 

OELKGNTRVFCRVRPVtiPOFPTPPPnT.TiT i?P<5r:Dr'PDc:n'DT3TT3T 

SLSRSDERRGTLSGAPAPPTRHDFSFDRVFPPGSGQDEVFEEIA 
MLVQSALDGY P VCI FAYGQTGSGKTFTMEGGPGGDPQLEGL I PR 
ALRHliFSVAQELSGQGWTYS FVASYVE I YNETVRDLLATGTRKG 
QGGECE I RRAGPGS EELT VTNARYVP VS CE KEVDALLHLARQNR 
AVARTAQNER SSRSHSVFQLQISGEHSSRGLQ CGAPLS LVDLAG 
SERLDPGLALGPGERERLRETQAINSSLSTLGLVIMALSNKESH 
VP YRNS KLT YLLQNSLGGSAKMLM F VN I SPLEENVS ES LNS LR F 
ASKVEPSVLFGTAQSNRKWKTDPDLCVCVCVCVCVCVCVCVCVP 
MSMYRVRGGRVAGGCFIGWRAPCPRAIK 


5457 


2 


1540 


DDFVERRRWTRTTCLVRSPPHVPVCGHACSWNGGSLDPtKGTPA 
LLRSAERLMRKVKKLRLDKENTGSWRSFSLNSEGAERMATTGTP 
TADRGDAAATDDPAARFOVOKHHSWnfiT ,PQT THfiQP VVGHT Tincrv 

APHDFQFVQKTDESGPHSHRLYYLGMPYGSRENSLLYSEIPKKV 
RKEALLLLSWKQMLDHFQATPHHGVYSREEELLRERKRLGVFGI 
TSYDFHSESGLFLFQASNSLFHCRDGGKNGFMVSPGPGCVSPMK 
PLEIKTQCSGPRMDPKICPADPAEFSFINNSDLWVANIETGEER 
RLTFCHQGLSNVLDDPKSAGVATFV1QEEFDRFTGYWWCPTASW 
EGSEGLKTLRILYEEVDESEVEVIHVPSPALEERKTDSYRYPRT 
GSKNPKIALKLAEFQTDSQGKIVSTQEKELVQPFSSLFPKVEYI 
ARAGWTRDGKYAWAMFLDRPQQWLQLVLLPPALFIPSTENEEQA 
ASLCQS CPQECPAVCGVRGGHQRLDQCS 


5458 


6642 


4022 


FVPGLREPQWEPAQPSATMSAPSEEEEYARLVMEAQPEWLRAEV 
KRLSHELAETTREKIQAAEYGLAVLEEKHQLKLQFEELEVDYEA 
IRSEMEQLKEAFGQAHTNHKKVAADGESREESLIQESASKEQYY 
VRKVLELQTELKQLRNVLTNTQSENERLASVAQELKEINQNVEI 
QRGRLRDDIKEYKFREARLLQDYSELEEENISLQKQVSVLRQNQ 
VEFEGLKHEIKRLEEETEYLNSQLEDAIRLKEISERQLEEALET 
LKTEREQKNSLRKELSHYMS INDS F YTSHLHVSLDGLKFSDDAA 
EPNNDAEALVNGFEHGGLAKLPLDNKTSTPKKEGLAPPSPSLVS 
DLLSELNISEIQKLKQQLMQMEREKAGLLATLQDTQKQLEHTRG 
SLSEQQEKVTRLTENLSALRRLQASKERQTALDNEKDRDSHEDG 
DYYEVDINGPEILACKYHVAVAEAGELREQLKALRSTHEAREAQ 
HAEEKGRYEAEGQALTEKVSLLEKASRQDRELLARLEKELKXVS 
DVAGETQGSLSVAQDELVTFSEELANLYHHVCMCNNETPNRVML 
D YYREGQGGAGRTS PGGRTS PEARGRRS P I LLPKGLLAPEAGRA 
DGGTGDSSPSPGSSLPSPLSDPRREPMNIYNL1AIIRDQIKHLQ 
AAVDRTTELSRQRIASQELGPAVDKDKEALMEEILKLKSLLSTK 
REQITTLRTVLKANKQTAEVALANLKS KYENEKAMVTETMMKLR 
NELKALKEDAATFSSLRAMFATRCDEYITQLDEMQRQLAAAEDE 
KKT LNS LLRMAI QQ KLALTQRL ELLEL DHE QTRRG RAKAAP KTK 
PATPSVSHTCACASDRAEGTGLANQVFCSEKHS I YCD 


5459 


316 


1262 


RGGHRLSGMASNFNDIVKQGYVRIRSRRLGIYQRCWLVFKKASS 
KGPKRLEKFSDERAAYFRCYHKVTELNNVKNVARLPKSTKKHAI 
GIYFNDDTSKTFACESDLEADEWCKVLQMECVGTRINDISLGEP 
DLLATG VE RE Q S ER FNVY LM P S PNLG C YMGE CALQ I T YE YI CL W 
DVQNPRVKLISWPLSALRRYGRDTTWFTFEAGRMCETGEGLFIF 
QTRDG E A I YQ KVHS AALA I AE QHE RLLQS VKNS MLQM KM S ERAA 
SLSTMVPLPRSAYWQHITRQHSTGQLYRLQDVSSPLKLHRTETF 
PAYRSEH 


5460 


45 


2097 


R PGCRAGELSTGS RARER VRNRVSAPCGQDSRRCDPBVLRGRSP 
GLGLAEMP S CGACTCGAAAVRL I TS S LASAQRGI SGGR I HMS VL 
G R LGT F ETQ 1 LQRA PLR S FTE T PAY FAS KDG I S KDGS GDGNKKS 
ASEGSSKKSGSGNSGKGGNQLRCPKCGDLCTHVETFVSSTRFVK 
CEKCHHFFWLSEADSKKSIIKEPESAAEAVKLAFQQKPPPPPK 
KI YNYLDK Y WGQS FAKKVLS VAV YNHYKR I YNNI PANLR QQAE 
VEKQTSLTPRELEIRRREDEYRFTKLLQIAGISPHGNALGASMQ 
QQ VNQQ I PQEKRGGE VLDS SHDDI KLEKSNI LLLGPTGSGKTLL 
AQTLAKCLD VP FAI CDCTTLTQAG YVGEDI ES V I AKLLQDANYN 
VEKAQQGIVFLDEVDKIGSVPGIHQLRDVGGEGVQQGLLKLLEG 
T I VNVP EKN S R KLRG ET VQ VDTTN I L FVAS G AFNGLDR 1 1 S RRK 
N E KYLG FGT P SNLGKGR RAAAAADLANRS G E SNTHQD I E E KDRL 
LRH VE ARDL I E FGM I P E F VGRL P VWPLHS LDE KTLVQ I LTEPR 
NAVIPQYQALFSMDKCELNVTEDAliKAIARLALERKTGARGLRS 
IMEKLLLEPMFEVPNSDIVCVEVDKEWEGKKEPGYIRAPTKES 
SEEEYDSGVEEEGWPRQADAANS 


5461


1461 


160 


INP P P P P KS P CG RAR KWRR RRR PG AP EAAVMEL P S GPG PERL FD 
SHRLPGDCFLLLVLLLYAP VGFCLLVLRLFLG I HVFLVS CALPD 
S VLRR F WRTMCAVLGLVARQEDSGLRDHS VRVL I SNHVT P FDH 
NIVNLLTTCSTPLLNSPPSFVCWSRGFMEMNGRGELVESLKRFC 
AS TRL P P TPLL L F P EE EATNGREG LliRFSS WP FS I QDWQP LTL 
Q VQRPL VS VT VS DASWVS ELL WS L F VP FTVYQ VRWLRP VHRQLG 
EANEE FALR VQQ L VAKE LG QTGTRLT PADKAEHM KRQRH PRLR P 
QSAQSSFPPSPqPSPDVQLATLAQRVKEVLPHVPLGVIQRDLAK 
TGCVDLTITNLLEGAVAFMPEDITKGTQSLPTASASKFPSSGPV 
TPQ P TALTFAK S S WARQ E S LQ ER KQAL YE YARRR FT ERRAQ E AD 


5462 


663 


3353 


KIKERQMSANNSPPSAQKSVLPTAIPAVLPAASPCSSPKTGLSA 
RLSNGS FSAPSLTNS RGSVHTVS FLLQ I GLTRE S VT I EAQELSL 
S AVKDL VCS I V YQKFPE CG FFGM YDK I LLFRHDMNSEN I LQL I T 
SADEIHEGDLVEWLSALATVEDFQIRPHTLYVHSYKAPTFCDY 
CGEMLmhVRQGLKCEGCGLNYHKRCAFKIPNNCSGVRKRRLSN 
VS LPGPGLS VPRPLQPE YVALPSEESHVHQEPS KR I PSWSGRP I 
WMEKMVMCRVKVPHTFAVHSYTRPTICQYCKRLLKGLFRQGMQC 
KDCKFNCHKRCASKVPRDCLGEVTFNGEPSSLGTDTDIPMDIDN 
NDINSDSSRGLDDTEEPSPPEDKMFFLDPSDLDVERDEEAVKTI 
SPSTSTOIPLMRWQSIKHTKRKSSTMVXEGWMVIIYTSRDNLRK 
RHYWRLDSKCLTLFQNESGSKYYKEIPLSEILRISSPRDFTNIS 
QG SNPH CFE 1 1 TDTM VY FVGENNGDS SHNP VLAATG VGLD VAQS 
WBKAIRQALMPVTPQASVCTSPGQGKDHKDLSTSISVSNCQIQE 
NVDISTVYQ1FADEVLGSGQFGIVYGGKHRKTGRDVAIKVIDKM 
RFPTKQESQLRKEVAILQNLHHPGIWLECMFETPERVFWMEK 
LHGDMLEMILSSEKSRLPERITKFMVTQILVALRNLHFKNIVHC 
DLKPENVLLASAEPFPQVKLCDFGFARIIGEKSFRRSWGTPAY 
LAPEVLRSKGYNRSLDMWSVGVIIYVSLSGTFPFNEDEDINDQI 
QNAAFM YP PNP WR E I S GEA I DL I NNLLQ VKMR KR YS VDKS LSH P 
WLQDYQTWLDLREFETRIGERYITHESDDARWEIHAYTHNLVYP 
KHFIMAPNPDDMEEDP 


5463 


237 


1012 


LLSVTMTTSRCSHLPEVLPDCTSSAAPWKTVEDCGSLVNGQPQ 
YVMQVSAKDGQLLSTWRTLATQSPFNDRPMCRICHEGSSQEDL 
LSPCECTGTLGTIHRSCLEHWLSSSNTSYCBIjCHFRFAVERKPR 
P L VE WLRN PGPQHE KRTLFGDM VCFL F I TPLAT I S G WL CLRGAV 
DHLHFSSRLEAVGLIALTVALFTIYLFWTLVSFRYHCRLYNBWR 
RTNQRVILLIPKSVNVPSNQPSLLGLHSVKRNSKETW 


5464 


195 


677 


SPSMNPRKKVD^KIjIIVGAIGVGKTSLLHQYVHKTFYEEYQTTL 
GAS ILSKIII LGDTTLKLQ I WDTGGQERVRSMVSTFY KGSDGC I 
LAFDVTDLESFEALDIWRGDVLAKIVPMEQSYPMVLLGNKIDLA 
DRKYQSILENHLTESIKLSPDQSRSRCC 


5465 


5278 


3348 


KGDPREFIRVHREALECDYVSAHLHEWIDLIFGYKQQGPAAVEA 
VNVFHHLF YEGQ VDI YNI NDPLKETAT IGF INNFGQ I PKQLFKK 
PHPPKRVRSRLNGDNAGISVLPGSTSDKIFFHHLDNLRPSLTPV 
KELKEPVGQIVCTDKGIIjAVEQNKVLIPPrWNKTFAWGYADLSC 
RLGTYESDKAMTVYECLSEWGQILCAICPNPKLVITGGTSTWC 
VWEMGTS KE KAKTVTLKQALLGHTDTVTCATAS LAYH 1 1 VSGSR 
DR T C 1 1 WDLNKLS FLTQLRGHRA P VS ALC I NELTGD I VS CAGT Y 
IHVWS INGNP I VS VNT FTGRS QQ 1 1 CCCMS EMNE WDTQN V I VTG 
HSDGVVRFWRMEFLQVPETPAPEPAEVLEMQEDCPEAQIGQEAQ 
DEDSSDSEADEQSISQDPKDTPSQPSSTSHRPRAASCRATAAWC 
TDS G S DDS RRW S DQ LS LDE KDG F I FVN Y S EGQTRAH LQG P LSH P 
H PN P I E VRN Y S R tiKPG YRW ERQ L V FR S KLTMHT AFDRKDNAHP A 
EVTALGISKDHSRILVGDSRGRVFSWSVSDQPGRSAADHWVKDE 
GGDSCSGCSVRFSLTERRHHCRNCGQLFCQKCSRFQSEIKRLKI 
SSPVRVCQNCYYNLQHERGSEDGPRNC 


5466 


3 


992 


HACAHASAHASGRLVRWWRKRRSVMGIQTSPVLLASLGVGLVTL 
LGLAVGSYLVRRSRRPQVTLLDPNEKYLLRLLDKTTVSHNTKRF 
RFALPTAHHTLGLPVGKHIYLSTRIDGSLVIRPYTPVTSDEDQG 
YVDLVIKVYLKGVHPKFPEGGKMSQYLDSLKVGDWEFRGPSGL 
LTYTGKGHFNIQPNKKSPPEPRVAKKLGMIAGGTGITPMLQLIR 
AILKVPEDPTQCFLLFANQTEKDIILREDLEELQARYPNRFKLW 
FTLDHP PKD WAYS KG FVTADM I REHLPAPGDDVL VLLCG P P PMV 
QLACHPNLDKLGYSQKMRFTY 


5467


2103 


4 


GEALRVGTRGCRRDLPDPQARI F I QKKDLEEDES VTAAHLKSRG 
RSPRKIDQFCNSSNMVHGSVTFRDVAIDFSQEEWECLQPDQRTL 
YRDVMLENYSHLI SLAGSS ISKPDVI TLLEQEKEPWMWRKETS 
RRYPDLELKYGPEKVSPENDTSEVNLPKQVIKQISTTLGIEAFY 
FRNDSEYRQFEGLQGYQEGNINQKMISYEKLPTHTPHASLICNT 
HKPYECKECGKYFSCGSNLIQHQSIHTGEKPYKCKECGKAFQLH 
IQLTRHQKFHTGEKTFECKECGKAFNLPTQLNRHKNIHTVKKLF 
ECKECGKSFNRSSNLTQHQSIHAGVKPYQCKECGKAFNRGSNLI 
QHQKIHSNEKPFVCKECGMAFRYHYQLIEHCQIHTGEKPFECKE 
CGKAFTLLTKLVRHQKIHTGEKPFECRECGKAFSLLNQLNRHKN 
IHTGEKPFECKECGKSFNRSSNLVQHQSIHAGIKPYECKECGKG 
FNRGAHLIQHQKIHSNEKPFVCRECEMAFRYHCQLIEHSRIHTG 
DKPFECQDCGKAFNRGSSLVQHQSIHTGEKPYECKECGKAFRLY 
LQLS QHQ KTHTGEKPFE C KE CGK F FRRG SNLNQHRS I HTGKKP F 
ECKECGKAFRLHMHLIRHQKLHTGEKPFECKECGKAFRLHMQLI 
RHQKLHTGEKPFECKECGKVFSLPTQLNRHKNIHTGEKAS 


5468 


225 


2976 


SFLTDLFQSLAQLENLCKQLYETTDTTTRLQAEKALVEFTNSPD 
CLS KCQLL LERG S S S YS QLIiAATCLT KLVS RTNN PL PLEQR I D I ' 
RNYVLN YLATRPKIiATFVTQALIQLYAR I TKLGWFDCQKDDYVF 
RNAITDVTRFLQDS VEYC I IGVTILSQLTNEINQVSATAFL I EA 
DTTHPLTKHRKIASSFRDSSLFDIFTLSCNLLKQASGKNLNLND 
ESQHGLLMQLLKLTHNCLNFDFIGTSTDESSDDLCTVQIPTSWR 
S AFLD S STLQLS T I G RCE Y E KTC ALLVQLFDQS AQS YQE LLQ3 A 
S AS PMD I AVQEGRLT WLVY 1 1 GAVIGGRVS FASTDEQDAMDGEL 
VCRVLQLMNLTDSRLAQAGNEKLELAMLSFFEOFRKIYIGDQVQ 
KSSKLYRRLSEVLGLNDETMVLSVFIGKIITNLKYWGRCEPITS 
KTLQLLNDLSIGYSSVRKLVKLSAVQFMLNNHTSEHFSFLGINN 
QSNLTDMRCRTTFYTALGRLLMVDLGEDEDQYEQFMLPLTAAFE 
AVAQM FSTNS FNEQEAKRTLVGLVRDLRGIAFAFNAKTS FMMLF 
E W I YPS YM P I LQRA I E LW YKD PACTT P VLKLMAELVHNRS QRLQ 
FDVS S PNG I LLFRETS KM ITM YGNRI LTLGE VP KDQVYALKLKG 
I S I C FSMLKAALSGS YVNFGVFRL YGDDALDNALQTF I KLLLS I 
PHSDLLDYPKLSQSYYSLLEVLTQDHMNFIASLEPHVIMYILSS 
ISEGLTAIiDTMVCTGCCSCriDHIVTYLFKQLSRSTKKRTTPLNQ 
ESDRFLHIMQQHPEMIQQMLSTVTjNIIIFEDCRNQWSMSRPLLG 
LILLNEKYFSDLRNSIVNSQPPEKQQAMHLCFENLMEGIERNLL 
TKNRDRFTQNLSAFRREVNDSMKNSTYGVNSNDMMS 


5469


134 


2653 


DQEFETSLVPWHLPMGWLCSGLLFPVSCLVLLQVASSGNMKVLQ 
E PTC VSDYMS I STCEW KMNGPTNCS TELRLLYQLVFLLS EAHTC 
VPENNGGAGCVCPTLLMDDWSADNYTLDLWAGQQLLWKGSFKPS 
EHVKPRAPGNLTVHTNVSDTLIiLTWSNPYPPDNYLYNHLTYAVN 
IWSENDPADFRIYNVTYLEPSLR1AASTLKSGISYRARVRAWAQ 
CYNTTWSEWSPSTKWHNSYREPFEQHLLLGVSVSCIVILAVCLL 
CYVSITKIKKEWWDQIPMPARSRLVAXIIQDAQGSQWEKRSRGQ 
E P AKC P HWKN CLT KLL P C FLEHNM KRDEDPH KAAKEM P FQGSGK 
SAW C P V E I S KT VL WP E S I S WR CVELFE AP VE C E E EEE VE E E KG 
S FCASPES SRDDFQEGREG I VARLTES LFLDLLGEENGG FCQQD 
MGESCLLPPSGSTSAHMPWDEFPSAGPKEAPPWGKEQPLHLEPS 
PPASPTQSPDNbTCTETPLVIAGNPAYRSFSNSLSQSPCPRELG 
PDPLLARHLEEVEPEMPCVPQIiSEPTTVPQPEPETWEQILRRNV 
LQHGAAAAPVSAPTSGYQEFVHAVEQGGTQASAWGLGPPGEAG 
YKAFSSLLASSAVSPEKCGFGASSGEEGYKPFQDLIPGCPGDPA 
PVPVPLFTFGLDREPPRSPQSSHLPSSSPEHLGLEPGEKVEDMP 
KPPLPQEQATDPLVDSLGSGIVYSALTCHLCGHLKQCHGQEDGG 
QTPVMASPCCGCCCGDRASPPTTPLRAPDPSPGGVPLEASLCPA 
SLAPSGISEKSKSSSSFHPAPGNAQSSSQTPKIVNFVSVGPTYM 
RVS 


5470 


17 


1418 


TACR IRTSLNRG I AAVKEDAVEMLAS YGLA YS LMKFFTGPMS DF 
KNVGLVFVNS KRDRTKAVLCMWAGAI AAVFHTL I AYSDLGY Y I 
INKLHHVDESVGSKTRRAFLYLAAFPFMDAMAWTHAGILLKHKY 
SFLVGCASISDVIAQWEVAILLHSHLECREPLIilPILSLYMGA 
LVRCTTLCLGYYKNIHDIIPDRSGPELGGDATIRKMLSFWWPLA 
LILATQRISRPIVNLFV3RDLGGSSAATEAVAILTATYPVGHMP 
YGWLTE I RAVYPAFDKNNPSNKLVS TSNTVTAAH I KKFT FVCMA 
LSLTLCFVMFWTPNVSEKI LIDI IGVDFAFAELCWPLR I FS FF 
PVPVTVRAHLTGWLMTLKKTFVLAPSSVLRI1VLIASLWLPYL 
GVHGATLGVGSLLAGFVGESTMDAIAACYVYRKQKKKMENESAT 
EGEDSAMTDMPPTEEVTDIVEMREENE 


5471 


1868 


658 


RSSAPPGPQRAAAATAAAAAAGVEMAAAAAQGGGGGEPRRTEGV 
GPGVPGEVEMVKGQPFDVGPRYTQLQYIGEGAYGMVSSAYDHVR 
KTRVAIKKISPFEHQTYCQRTLREIQILLRFRHENVIGIRDILR 
ASTLBAMRDVYI VQDLMETDL YKLLKS QQLSNDH I C Y FLYQ I LR 
GLKYIHSANVLHRDLKPSNLLINTTCDLKICDFGLARIADPEHD 
HTGFLTEYVATRWYRAPEIMLNSKGYTKSIDIWSVGCILAEMLS 
NRPIFPGKHYLDQLNHILGILGSPSQEDLNCIINMKARNYLQSL 
PSKTKVAWAKLFPKSDSKALDLLDRMLTFNPNKRITVEEALAHP 
YLEQYYDPTDEPVAEEPFTFAMELDDLPKERLKELIFQETARFQ 
PGVLEAP 


5472 


1469 


753 


LYVMARYLSDEEVAVS IDRLCKANGRS PS IPFGTVRIPGRARVR * 

DPQALWIFGYGSLVWRPDFAYSDSRVGFVRGYSRRFWQGDTFHR 

GSDKMPGRWTLLEDHEGCTWGVAYQVQGEQVSKALKYLNVREA 

VLGGYDTKEVTFYPQDAPDQPLKALAYVATPQNPGYLGPAPEEA 

IATQILACRGFSGHNLEYLLRVRDVMQLCGPQAQDEHLAAIVDA 

VGTMLPCFCPTEQALALV 


5473


3 


2119 


FMNVKLLIQDLEDIEQRVPVMDAQYKIITKTAHLITKESPQEEG 
KEMFATMSKLKEQLTKVKECYSPLLYESQQLLIPLBELEKQMTS 
FYDSLGKINEIITVLEREAQSSALFKQKHQELLACQENCKKTLT 
LIEKGSQSVQKFVTLSNVLKHFDQTRLQRQIADIHVAFQSMVKK 
TGDWKKHVETNSRLMKKFEESRAELEKVLRIAQEGLEEKGDPEE 
LLRRHTEFFSQLDQRVLNAFLKACDELTDILPEQEQQGLQEAVR 
KliHKQWKDLQGEAPYHLiLHLKIDVEKNRFLASAEECRTELDRET 
KLMPQEGSEKIIKEHRVFFSDKGPHHLCEKRLQLIEELCVKLPV 
RDPVRDTPGTCHVTLXELRAAIDSTYRKLMEDPDKWKDYTSRFS 
E FS S W I STNETQLKG I KGEAIDTANHGEVKRAVE E IRNGVTKRG 
ETLSWLKSRLKVLTEVSSENEAQKQGDEIiAKLSSSFKALVTLLS 
EVEKMLSNFGDCVQYKEIVKNSLEELISGSKEVQEQAEKILDTE 
NLFEAQQLLLHHQQKTKRISAKKRDVQQQIAQAQQGEGGLPDRG 
HEELRKLESTLDGLERSRERQERRIQVTLRKWERFETNKETWR 
YLFQTGSSHERFLSFSSLESLSSELEQTKEFSKRTESIAVQAEN 
L VKEAS E I PLG PQNKQ L LQQQAKS I KEQ VKKL E DTLE E E YVI DK 
S 


5474 


2 


780 


TPDVRQLQASRRGIAVASWCSPRWFAGEEMAFVKSGWLLRQSTI 
LKRWKKNWFDLWSDGHLIYYDDQTRQNIEDKVHMPMDCINIRTG 
QECRDTQPPDGKSKDCMLQIVCKDGKTISLCAESTDDCLAWKFT 
LQDSRTNTAY VGSAVMTDETS WS S P P P YTAYAAPAPEVGRTLS 
LQQAYGYGPYGGAYPPGTQWYAANGQAYAVPYQYPYAGLYGQQ 
PANQVI I RER YRDNDSDLALGMLAGAATGMALGS LFWVF 


5475 


2 


506 


ARGWLESLSLTCQTTPPPSSPCLLHSPETFIHTMPPNLTGYYRF 
VSQKNMEDYLQALNISLAVRKIALLLKPDKEIEHQGNHMTVRTL 
STFRNYTVQFDVGVEFEBDLRSVDGRKCQTIVTWEEEHLVCVQK 
GEVPNRGWRHWLEGEMLYLELTARDAVCEQVFRKVR 


5476 


192 


1457 


SDSMSLLDCFCTSRTQVESLRPEKQSETSIHQYLVDEPTLSWSR 
PSTRASEVLCSTNVSHYELQVEIGRGFDNLTSVHLARHTPTGTL 
VTIKITNLENCNEERLKALQKAVILSHFFRHPNITTYWTVFTVG 
SWLWVISPFMAYGSASQLLRTYFPEGMSETLIRNILFGAVRGLN 
YLHQNGC I HRS I ICASHI LISGDGLVTLSGLSHLHSLVKHGQRHR 
AVYDFPQFSTSVQPWLSPELLRQDLHGYNVKSDIYSVGITACEL 
ASGQVP FQDMHRTQMLLQKLKGPPYS PLD I S I FPQSESRMKNSQ 
SGVDSGIGESVLVSSGTHTVNSDRLHTPSSKTFSPAFFSLVQLC 
LQQDPEKRPSASSLLSHVFFKQMKEESQDSILSLLPPAYNKPSI 
SLPPVLPWTEPECDFPDEKDSYWEF 


5477 


3 


1044 


RGNSRLRYSHEDELQLPRLPELFETGRQLLDEVEVATEPAGSRI 
VQEKVFKGLDLLEKAAEMLSQLDLFSRNEDLEEIASTDLKYLLV 
P AFQGALTM KQ VN PS KRLDHLQRAREH F I N YLTQ CHC YHVAE F E 
L.PKTMNWSAENHTANSSMAYPSLVAMASQRQAKIQRYKQKKELE 
HRLSAMKSAVESGQADDERVREYYLLHLQRWIDISLEEIESIDQ 
EIKILRERDSSREASTSNSSRQERPPVKPFILTRNMAQAKVFGA 
GYPSLPTMTVSDWYEQHRKYGALPDQGIAKAAPEEFRKAAQQQE 
EQEEKEEEDDEQTLHRAREWDDWKDTHPRGYGNRQNMG 


5478 


2 


835 


KTVRIWVPNVKGESTVFRAKTATVRSVHFCSDGQSFVTASDDKT
VKVWATHRQKFLFSLSQHINWVRCAKFSPDGRLIVSASDDKTVK
LWDKSSRECVHSYCEHGGFVTYVDFHPSGTCIAAAGMDNTVKVW
DVRTHRLLQHYQLHSAAVNGLSFHPSGNYLITASSDSTLKILDL
MEGRLLYTLHGHQGPATTVAFSRTGEYFASGGSDEQVMVWKSNF
DIGDHGEVTKVPRPPATLASSMGNLTVSILEQRLTLEEDKLKQC
LENQQLIMQRATP


5479 


2 


835 


KTVRIWVPNVKGESTVFRAHTATVRSVHFCSDGQSFVTASDDKT
VKVWATHRQKFLFSLSQHINWVRCAKFSPDGRLIVSASDDKTVK
LWDKSSRECVHSYCEHGGFVTYVDFHPSGTCIAAAGMDNTVKVW
DVRTHRLLQHYQLHSAAVNGLSFHPSGNYLITASSDSTLKILDL 
MEGRLLYTLHGHQGPATTVAFSRTGEYFASGGSDEQVMVWKSNF 
DIGDHGEVTKVPRPPATLASSMGNLTVSILEQRLTLEEDKLKQC
LENQQLIMQRATP 


5480 


444 


1952 


LSLTSRMEEAELVKGRLQAITDKRKIQEEISQKRLKIEEDKLKH 
QHLKKKALREKWLLDGISSGKEQEEMKKQNQQDQHQIQVLEQSI 
LRLEKEIQDLEKAELQISTKEEAILKKLKSIERTTEDIIRSVKV 
EREERAEES I EDIYANI PDLPKS YI PSRLRKE I NEEKEDDEQNR 
KALYAMEIKVEKDLKTGESTVLSSIPLPSDDFKGTGIKVYDDGQ 
KSVYAVSSNHSAAYNGTDGLAPVEVEELLRQASERNSKSPTEYH 
EPVYANPFYRPTTPQRETVTPGPNFQERIKI KTNGLGIGVNES I 
HNMGNGLSEERGNNFNHI S PI PPVPHPRSVIQQAEEKLHTPQKR 
LMTPWEESNVMQDKDAPSPKPRLSPRBTIFGKSEHQNSSPTCQE 
DEEDVRYNIVHSLPPDINDTEPVTMIFMGYQQAEDSEEDKKFLT 
GYDGIIHAELWIDDEEEEDEGEAEKPSYHPIAPHSQVYQPAKP 
TPLPRKRSEASPHEKHKS 


5481


3 


1422 


NSPGSVCLCQCVCPSLLHCLPPLLLLLLLPLLLHESPQPPALRV 
VATSSDRNFMNKHQKPVLTGQRFKTRKRDEKEKFEPTVFRDTLV 
QGLNEAGDDLEAVAKFLDSTGSRLDYRRYADTLFDILVAGSMLA 
PGGTRIDDGDKTKMTNHCVFSANEDHETIRNYAQVFNKLIRRYK 
YLEKAFEDEMKKLLLFLKAFSETEQTKLAMLSGILLGNGTLPAT 
I LTS LFTDS LVKEG I AAS FAVKLFKAWMAEKDANS VTS S LRKAN 
LDKRLLELFPVNRQSVDHFAKYFTDAGLKELSDFLRVQQSLGTR 
KELQKELQERLSQECPIKEWLYVKEEMKRNDLPETAVIGLLW? 
C I MNAV E WN KKEELVAE QALKHLKQ Y AP LLAVFS SQGQSELILL 
QKVQEYCYDNIHFMKAFQKIWLFYKADVLSEEAILKWYKEAHV 
AKGKSVFLDQMKKFVEWLQNAEEESESEGEEN 


5482 


1492 


528 


THWMTGMCYAPHQVLSYINGVTTSKPGVSLVYSMPSRNLSLRL
EGLQEKDSGPYSCS VNVQDKQGKS RGHS I KTLELNVLVP PAP PS 
CRLQGVPHVGANVTLSCQSPRSKPAVQYQWDRQLPSFQTFFAPA 
LDVIRGSLSLTNLSSSmG VYVCKAHNEVGTAQCNVTLEVSTGP 
GAAWAGAWGTLVGLGLLAGLVLLYHRRG KALE EP AND I KEDA 
IAPRTLPWPKSSDTISKNGTLSSVTSARALRPPHGPPRPGALTP 
TPSLSSQALPSPRLPTTDGAHPQPISPIPGGVSSSGLSRMGAVP 
VMVPAQSQAGSLV 


5483


1 


788 


FFFFKGCRAGRGNESDYRKLEEMHQRFLVSERSKDDLQLRLTRA

ENRIKQLETDSSEEISRYQEMIQKLQNVLBSERENCGLVSEQRL 

KLQQENKQLRKETESLRKIALEAQKKAKVKISTMEHEFSIKERG 

FEVQLREMEDSNRNSIVELRHLLATQQKAANRWKEETKKLTESA 

EIRINNLKSELSRQKLHTQELLSQLEMANEKVAENEKLILEHQE 

KANRLQRRL S Q AE ERAAS ASQQLS V I TVQRR KAAS LMNLEN I 


5484 


3 


1997 


IMADMEDLFGSDADSEAERKDSDSGSDSDSDQENAASGSNASGS

ESDQDERGD3GQPSNKELFGDDSEDEGASHHSGSDNHSERSDNR 

SEASERSDHEDNDPSDVDQHSGSEAPNDDEDEGHRSDGGSHHSE 

AEGSEKAHSDDEKWGRBDKSDQSDDEKIQNSDDEERAQGSDEDK 

LQNSDDDEKMQNTDDEERPQLSDDERQQLSEEEKANSDDERPVA 

SDNDDEKQNSDDEEQPQLSDEEKMQNSDDERPQASDEEHRHSDD 

EEEQDHKSES&RGSDSEDEVLRMKRKNATASDSEADSDTEVPKD 

NSGTMDLFGGADDISSGSDGEDKPPTPGQPVDENGLPQDQQEEE 

P I P E T R I EVE I P KVNTDLGNDL Y F VKLPNFLS VE PR P FD PQ Y YE 

DEFEDEEMLDEEGRTRLKLKVENTIRWRIRRDEEGNEIKESNAR 

IVKWSDGSMSLHLGNEVFDVYKAPLQGDHNHLFIRQGTGLQGQA 

VFKTKLTFRPHSTDSATHRKMTLSIiADRCSKTQKIRXLPMAGRD 

PECQRTEMIKKEEERLRAS I RRESQQRRMREKQHQRGLSAS YLE 

PDRYDEEEEGEESISLAAIKNRYKGGIREERARIYSSDSDEGSE 

EDKAQRLLKAKKLTSDEVRPNLFNSRGLSCTQEPTALNEELTDQ 

AGTN 


5485 


161 


1074 


KRKI L S SMMDS EAHEKRP P I LTS S KQD I S PH ITNVGEMKH YLCG 
CCAAFNNVAI TFP I QKVL FRQQLYG I KTRDAI LQLRRDG FRNL Y 
RGILPPLMQKTTTLALMFGLYEDLSCLLHKHVSAPEFATSGVAA 
VLAGTTEAIFTPLERVQTLLQDHKHHDKFTNTYQAFKALKCHGI 
GE YYRGLVPI LFRNGLSNVLFFGLRGP I KEHLPTATTHSAHLVN 
DFICGGLLGAMLGFLFFPINWKTRIQSQIGGEFQSFPKVFQKI 
WLERDR KL I N L FRGAHLN YHR S L ISWG 1 1 NAT YE FLLKV I 


5486 


1404 


142 


I PGSTI S WSPAAARGLSVCRCCRLHPASAMDLFGDLPEPERS PR 
P AAG KE AQ KG P LL FDDLP PAS S TDSG SGG P LLFDD L P PAS S GDS 
G S LATS I S QMVKTEG KGAKRKTS EEEKNGS EELVEKKVCKAS S V 
IFGLKGYVAERKGEREEMQDAHVILNDITEECRPPSSLITRVSY 
FAVFDGHGGI RAS KFAAQNLHQNLIRKFPKGDV I S VEKTVKRCL 
LDTFKHTDEE FL KQAS S QKPAWKDGSTATCVLAVDNI L Y IANLG 
D S RA I LCR YNE E S QKHAALS LS KEHN PTQ Y E ERMR I QKAGGNVR 
DGRVLGVLEVSRS IGDGQYKRCGVTSVPD I RRCQLT PNDRF I LL 
SEQ
ID
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








ACDGLFKVFTPEEftVNFILSCLEDEKIQTREGKSAADARYEAAC 
NRLANKAVQRGS ADNVTVMWR IGH 


5487 


535 


182 


AVSLEQIRGLQTPAPVPLPLQPCPSNCDMERVTLALLLLAGLTA 
LEANDPFANKDDPFYYDWKNLQLSGLICGGLLAIAGIAAVLSGK 
CKCKSSQKQHSPVPEKAIPLITPGSATTC 


5488 


1072 


259 


AMAASGEPQRQWQEEVAAWWGSCMTDLVSLTSRLPKTGETIH 
GHKFFIGFGGKQANQCVOAARLGAMTSMVCKVGKDSFGNDYIEN 
LKQNDISTE FTYQTKDAATGTAS 1 1 VNNEGQNI I VIVAGANLLL 
NTEDLRAAANVISRAKVMVCQLEITPATSLEALTMARRSGVKTL 
FNPAPAIADLDPQFYTLSDVFCCNESEAEILTGLTVGSAADAGE 
AALVLLKRGCQ WI ITLGAEGCWLSQTEPE PKHI PTEKVKAVD 
TTVSFKI 


5489 


81 


893 


GKGPVAAFIDQSNIFLTDPKIFLGQWREEPKMPLLLLGETEPLK
LERDCRSPVEPWAAASPDLALACLCHCQDLSSGAFPNRGVLGGV
LFPTVEMVIKVFVATSSGSIAIRKKQQEVVGFLEANKIDFKELD
IAGDEDNRRWMRENVPGEXKPQNGIPLPPQIFNEEQYCGDFDSF
FSAKEENIIYSFLGLAPPPDSKGSEKAEEGGETEAQKEGSEDVG
NLPEAQEKNEEEGETATEETEEIAMEGAEGEAEEEEETAEGEEP 
GEDEDS 


5490 


81 


893 


GKGPVAAFIDQSNIFLTDPKIFLGQWREEPKMPLLLLGETEPLK
LERDCRSPVEPWAAASPDLALACLCHCQDLSSGAFPNRGVLGGV
LFPTVEMVIKVFVATSSGSIAIRKKQQEVVGFLEANKIDFKELD
IAGDEDNRRWMRENVPGEKKPQNGIPLPPQIFNEEQYCGDFDSF
FSAKEENIIYSFLGLAPPPDSKGSEKAEEGGETEAQKEGSEDVG
NLPEAQEKNEEEGETATEETEEIAMEGAEGEAEEEEETAEGEEP
GEDEDS 


5491 


204 


1194 


GSAPRLSLGPTGAQARDPDWWARPPSRPYTQSKEDRPDTEGRSE 
QGDMASSFLPAGAITGDSGGELSSGDDSGEVEFPHSPEIEETSC 
LAE LFE KAAAHLQ GL I QVASRE QLL YL Y AR YKQ VKVGNCNT P KP 
SFFDFEGKQKWEAWKALGDSSPSQAMQEYIAWKKLDPGWNPQI 
PEKKGKEANTGFGGPVISSLYHEETI REEDKNI FDYCRENN IDH 
ITKAlKSKNVDVNVKDEEGRALIiHWACDRGHKELVTVLLQHRAD 
INCQDNEGQTALHYASACEFLDIVELLLQSGADPTLRDQDGCLP 
EEVTGCKTVSLVLQRHTTGKA 


5492 


3 


1896 


ASKNPLSAVCTTGIMSSLAVRDPAMDRSLRSVFVGNIPYEATEE 
QLKDIFSEVGSWSFRLVYDRETGKPKGYGFCEYQDQETALSAM 
RNLNGRE FS GRALRVDNAAS E KN KEEL KS LG P AAP IIDSPYGDP 
I DPEDAPES I TRAVASLPPEQMFELMKQMKLC VQNSHQEARNML 
LQNPQLAYALLQAQWMRXMDPEIALKILHRKIHVTPLIPGKSQ 
SVSVSGPGPGPGPGLCPGPNVLLNQQNPPAPQPQHLARRPVKDI 
PPLMQTPIQGGIPAPGPIPAAVPGAGPGSLTPGGAMQPQLGMPG 
VGPVPLERGQVQMSDPRAPIPRGPVTPGGLPPRGLLGDAPNDPR 
GGTLLS VTGE VE PRG YLGP PHQGP PMHHAS GHDTRGP S SHEMRG 
GPLGDPRLLIGEPRGPMIDQRGLPMDGRGGRDSRAMETRAMETE 
VLETRVMERRGMETCAMETRGMEARGMDARGLEMRGPVPSSRGP 
MTGGIQGPGPINIGAGGPPQGPRQVPGISGVGNPGAGMQGTGIQ 
GTGMQGAG I QGGGMQGAG IQGVS I QGGG I QGGGI QGAS KQGGSQ 
PSSFSPGQSQVTPQDQEKAALIMQVIiQLTADQIAMLPPEQRQSI 
LILKEQIQKSTGAS 


5493 


1 


1876 


RAPMMTKAVPEEPRKPGRLTQALNSPLTWEHVWICVPGGTPDCL

TDT FR VKR PHL RRSASNGH VPG T P VYR E KEDM YDE 1 1 E LKKS LH 

VQKSDVDLMRTKLRRLEEENSRKDRQIEQLLDPSRGTDFVRTLA 

EKRPDASWVINGLKQRILKLEQQCKEKDGTISKLQTDMKTTNLE 

EMRIAMETYYEEVHRLQTLtiASSETTGKKPLGEKKTGAKRQKKM 

GSALLSLSRSVQELTEENQSLKEDLDRVLSTSPTISKTQGYVEW 

S K PRL LRR I VELE KKL S VME S S KS HAAE PVR S H P P AC LAS S SAL 

HRQPRGDRNKDHERLRGAVRDLKEERTALQEQLLQRDLEVKQLL 

QAKADLEKELECAREGEEERREREEVLREEIQTLTSKLQELQEM 

KKEEKEDCPEVPHKAQELPAPTPSSRHCEQDWPPDSSEEGLPRP 

RSPCS DGR RDAAAR VLQAQ WKVYKHKKKKAVLDEAA WLQAAFR 
GHLTRTKLLASKAHGSEPPSVPGLPDQSSPVPRVPSPIAQATGS 
PVQEEAIVIIQSALRAHLARARHSATGKRTTTAASTRRRSASAT 
HGDASSPPFLAALPDPSPSGPQAVAPLPGDDVNSDDSDDIVIAP 
SLPTKNFPV 


5494 


71 


536 


RSKAXIGTPTREVPSTDMKVRRESSSSLTHRPAPSPATPRLLGT 
RRVLLGVSEGTGCADAMELVLVFLCSLLAPMVLASAAEKEKEMD 
P FHYDYQTLR IGGLVFAWLFS VGI LLILSRRCKCS FNQKPRAP 
GDEEAQVENLITANATEPQKAEN 


5495 


273 


2168 


DSLLLIQVDTMPFTLHLRSRLPSAIRSLILQKKPNIRNTSSMAG 
ELRPASLWLPRSLAPAFERFCQVNTGPLPLLGQSEPBKWMLPP 
QGAISETRMGHPQFWKYEFGACTGSLASLEQYSEQLKDMVAFFL 
GCSFSLEEALEKAGLPRRDPAGHSQAGAYKTTVPCVTHAGFCCP 
LWTMR PIP KDKLEGLVRACCS LGGEQGQ P VHMGDPELLG I KEL 
S KPAYGDAMVCPPGEVP VFWPS PLTSLGAVS S CETPLAFAS I PG 
CTVMTDLKDAKAPPGCLTPERIPEVHHISQDPLHYSIASVSASQ 
KIRELESMIGIDPGNRGIGHLLCKDELLKASLSLSHARSVLITT 
G F P TH FNHE P PE ETDGP PGAVAL VA FL QALE KE VA 1 1 VDQRAWN 
LHQKIVEDAVEQGVLKTQIPILTYQGGSVEAAQAFLCKNGDPQT 
PRFDHLVAI ERAGRAADGNYYNAR KMN I KHLVD P I DDLFLAAKK 
IPGISSTGVGDGGNELGMGKVKEAVRRHIRHGDVIACDVEADFA 
VIAGVSNWGGYALACALYILYSCAVHSQYLRKAVGPSRAPGDQA 
WTQALPSVIKEEKMLGILVQHKVRSGVSGIVGMEVDGLPFHNTH 
AEM I QKI/VD VTTAQV 


5496 


3 


2408 


QDTKMHEIYKGNITPQLNKNTLKTSAATDVWAVYFSQFWIDYEG

MKSGKGRPISFVDSFPLSIWICQPTRYAESQKEPQTCNQVSLNT 

SQSESSDLAGRLKRKKLLKEYYSTESEPLTNGGQKPSSSDTFFR 

FSPSSSEADIHLLVHVHKHVSMQINHYQYLLLLFLHESLILLSE 

NLRKDVEAVTG S PAS QTS I C I G I LtiRS AE LALLLH P VDQANTLK 

SPVSESVSPWPDYLPTENGDFLSSKRKQISRDINRIRSVTVNH 

MSDNRSMSVDLSHIPLKDPLLFKSASDTNLQKGISFMDYLSDKH 

LGKISEDESSGLVYKSGSGEIGSETSDKKDSFYTDSSSVLNYRE 

DSN I I*S FDS DGNQN ILSS TL TS KGNET I ES I FKAEDLLPEAAS L 

SENLDISXEETPPVRTLKSQSSLSGKPKERCPPNLAPLCVSYKN 

MKRSSSQMSLDTISLDSMILEEQLLESDGSDSHMFLEKGNKKNS 

TTNYRGTAESVNAGANLQNYGETS PDAI STNSEGAQENHDDLMS 

VWFKITGVNGEIDIRGEDTEICIiQVNQVTPDQLGNISLRHYLC 

NRPVGSDQKAVIHSKSSPEISLRFESGPGAVIHSLLAEKNGFLQ 

CH I KKFS TE FLTSS LMN I QH FLEDE T VAT VMPM KIQVS NTK I NL 

KDDSPRSSTVSLEPAPVTVHIDHLWERSDDGSFHIRDSHMLNT 

GNDLKENVKSDSVLLTSGKYDLKKQRSVTQATQTSPGVPWPSQS 

ANFPEFSFDFTREQLMEENESLKQELAKAKMALAEAHLEKDALL 

HHIKKMTVE 


5497 


1821 


3308 


SISKLLKRRSNIDAYLLSNSCAFFAPRLFSIiASQIIREQQ^PNV 
CFIYKYSGFPSLECQCHFVSPHSSCYINFFSFPPPFFVCFQLSN 
GFSHYSLSSESHVGPTGAGLFPHCLPASRLLPRVTSVHLPDYAH 
YYTIGPGMFPSSQIPSWKDWAKPGPYDQPIiVNTLQRRKEKREPD 
PNGGGPTTASGPPAAAEEAQRPRSMTVSAATRPGEEMEACEELA 
LALSRGLQLDTQRSSRDSLQCSSGYSTQTTTPCCSEDTIPSQVS 
DYDYFSVSGDQEADQQEFDKSSTIPRNSDISQSYRRMFQAKRPA 
STAGLPTTLGPAMVTPGVATIRRTPSTKPSVRRGTIGAGPIPIK 
TPVIPVKTPTVPDLPGVLPAPPDGPEERGEHSPESPSVGEGPQG 
VTSMPSSMWSGQASVNPPLPGPKPSIPEEHRQAIPESEAEDQER 
EPPSATVSPGQIPESDPADLSPRDTPQGEDMLNAIRRGVKIiKKT 
TTNDRSAPRFS 


5498


2434 


1492 


HiTHQEIFTGEKPCECGKASIQMSHLSQQKIYSGENPFACKVCG 
KVFSHKSNLTEHEHFHTREKPFECNECGKAFSQKQYVIKHQNTH 
TGEKLFECNECGKSFSQKENLLTHQKIHTGEKPFECKDCGKAFI 
Q KSNL I RHQRTHTGE KP FVCKE CGKT FS GKSNLTEH E K I H I GEK 
PFKCSECGTAFGQKKYLIKHQNIHTGEKPYECNECGKAFSQRTS 
LIVHVRIHSGDKPYECNVCGKAFSQSSSLTVHVRSHTGEKPYGC 
NE CG KAFS Q FS TLALHLR I HTGKKP YQCS E CG KAFSQ KSHH I RH 
QKIHTH 


5499 


324 


926 


GFGQ IGRGHKITTYPFS PRKSGRKGMAQSQGWVKRYI KAFCKGF 
FVAVPVAVT FLDRVACVAR VEGAS MQPSLNPGGS QSSDWLLNH 
WKVRNFEVHRGDIVSLVSPKNPEQKIIKRVIALEGDIVRTIGHK 
NR YVKVPRGH I W VEGDHHGHSFDSNS FGPVSLGLLHAHATHI LW 
PPERWQKLESVLPPERLPVQREEE 


5500 


1978 


1286 


KPDWRLQNLPPRLYLWRSSRFGFGHLKKRLQMDFKIEHTWDGFP

VKHEPVFIRLNPGDRGVMMDISAPFFRDPPAPLGEPGKPFNELW 

DYEVVEAFFL^DITEOYLEVETirPHRnTTTiVT.TJ.QriWR'Kn^Kiv-nc'T 

PLSFRVSRGETKWEGKAYLPWSYFPPNVTKFNSFAIHGSKDKRS 
YEALYPVPQHELQQGQKPDFHCLEYFKSFNFNTLLGEEWKQPES 
DLWLIEKCDI 


5501 


2927 


222S 


CRPPVSARVAPGHQGAVGGSGRRPARVEWDAAARPSSRPFSLP 
AAIMLALISRLLDWFRSLFWKBEMELTLVGLQYSGKTTFVNVIA 
SGQFSEDMIPTVGFNMRKVTKGNVTIKIWDIGGQPRFRSMWERY 

NKRDLPNALDEKQL I EKMNLSAI QDRE I CCYS I S CKEKDNI D I T 
LQWLIQHS KSRRS 


5502 


3 


824 


NSAFPVWVPERTALliTCPLGAAPGSSREAPGlAGPPNSTAMSKL 

vjiu? r ixu^^boJvbKAAyoFUiWUjVKLRETEEMLGK 

RE I ALAKKHGTQNKRAALQ ALKRKKR FE KQLTQ I DGTLS T I E FQ 

REALENSHTNTEVLRNMGFAAKAMKS VHENMDLNKIDDLMQE I T 

EQQDIAQEISEAFSQRVGFGDDFDEDELMAELEELEQEELNKKM 

TNIRLPNVPSSSLPAQPNRKPGMSSTARRSRAASSQRAEEEDDD 

IKQLAAWAT 


5503 


216


654 


KGVRRRGRVRSDSEDSHLGYFKMSFLLPKLTSKKEVDQAIKSTA 
EKVLVLRFGRDEDPVCLQLDDILSKTSSDLSKMAAIYLVDVDQT 
AVYTQYFDISYIPSTVFFFNGQHMKVDYGGEDPALRSIKAVRRT 
SPAGTLGEKPVNS


5504


58 


3563 


QLSFSFQAPVTFDDITVYLLQEEWVLLSQQQKELCGSNKLVAPL 
GPTVANPELFRKFGRGPEPWLGSVQGQRSLLEHHPGKKQMGYMG 
EMEVQGPTRESGQSIiPPQKKAYLSHLSTGSGHIEGDWAGRNRKL 
LKPRS IQKS WFVQFPWL IMNEEQTALFCSACREYPS IRDKRSRL 
I EGYTG P FKVETLKYHAKS KAHMFC VNALAARDP I WAARFRS I R 
DPPGDVliASPEPLFTADCPIFYPPGPLGGFDSMAELLPSSRAEL 
BVPGGDGAIPAMYLDCISDLRQKEXTDGXH3SSDINXLYNDAVE 
SCI QDP SAEGLSE EVPWFEELP WFEDVAVYFTREE WGMLDKR 
Q KE L YR D VMRMN YE LLAS LG P AAAKPDL I S KLERRAAPW I KD PN 
GPKWGKGR PPGNKKMVAVREAPTQASAADSALLPGS PVEARASC 
CSSSICEEGDGPRRIKRTYRPRSIQRSWFGQFPWLVIDPKETKL 
FCSACIERPWLHDKSSRLVRGYTGPFKVETLKYHEVSKAHRLCV 
NTVEIKEDTPHTALVPEISSDLMANMEHFFNAAYSIAYHSRPLN 
DFEKILQLLQSTGTVILGKYRNRTACTQFIKYISETLKREILED 
VRNSPCVSVLLDSSTDASEQACVGIYIRYFKQMEVKESYITLAP 
LYSETADGYFETIVSALDELDIPFRKPGWWGLGTDGSAMLSCR 
GGLVE KFQE VI PQLLPVHCVAHRIiHLAWDACGS I DLVKKCDRH 

RRTLHALLVS W PALARHLQRVAE AGGQ IGHRAKGMLKLMRGFHF 
VKFCHFLLDFLSIYRPLSEVCQKEIVLITEVNATLGRAYVALES 
LRHQAGP KEEEFNAS FKDGRLHG I CLDKLE VAEQRFQADRERTV 
LTGIEYLQQRFDADRPPQLKNMEVFDTMAWPSGIELASFGNDDI 
LNLARYFBCSLPTGYSEEALLBEWLGLKTIAQHLPFSMLCKNAL 
AQHCR FPLL S KLMA VWCVP I S TS CCERG FKAMNR I R TDERTKL 
SNEVLNMLMMTAVNGVAVTEYDPQPAIQHWYLTSSGRRFSHVYT 
CAQVPARSPASARLRKEEMGALYVEEPRTQKPPILPSREAAEVL 
KDCIMSPPERLLYPHTSQEAPGMS 


5505 


3312 


1219 


NCSPRSI^AAKMSNRNNNKLPSNLPQLQNLIKRDPPAYIEEFLQ 
Q YNH YXSNVE I FKLQPNKPS KELAEL VMFMAQ I SHC YPE YLSNF 
PQEVKDLLSCNHTVLDPDLR>TrFCiUVLILIiRNKNLINPSSIjLEL 
FFELFRCHDKLLRKTLYTHIVTDIKNINAKHKNNKVNVVLQNFM 
YTMLRDSNATAAKMS LDVMI EL YRRNI WNDAKTVNV I TTACFS K 
VTKILVAALTFFLGKDEDEKQDSDSESEDDGPTARDLLVQYATG 
KKSSKNKKKLEKAMKVLKKHRKKKKPEVFNFSAIHLIHDPQDFA 
EKLLKQLECCKERFEVKMMLMNLISRLVGIHELFLFNFYPFLQR 
FLQPHQREVTKILLFAAQASHHIiVPPEIIQSLLMTVANNFVTDK 
NSGEVMTVGINAIKEITARCPLAMTEELLQDLAQYKTHKDKNVM 
MSARTLXHLFRTLNPQMLQKKFRGKPTEASIEARVQEYGELDAK 
DYIPGAEVLEVEKEENAENDEDGWESTSLSEEEDADGEWIDVQH 
SSDEEQQEISKKLNSMPMEERKAKAAAISTSRVLTQEDFQKIRM 
AQMRKELDAAPGKSQKRKYIEIDSDEEPRGELLSLRDIERLHKK 
PKSDKETRLATAMAGKTDRECEFVRKKTKTNPFSSSTNKEKKKQK 
NFMMMRYSQNVRSKNKRSFREKQLALRDALLKKKKRMK 


5506 


1 


1531 


FRGDLCGQRGGSAPGEGGSSAWPAPAHPLPEREREREALCPGRS 
CSGGGGEETPGTTPVWSPLEGGGDEELRPNPYVRFPYRWWAVVV 
LAAFPSLGAGGETPEAPPESWTQLWFFRFWNAAGYASFMVPGY 
LLVQYFRRKNYLETGRGLCFPLVKACVFGNEPKASDEVPLAPRT 
EAAETTPMWQALKLLFCATGLQVSYLTWGVLQERVMTRSYGATA 
TSPGERFTDSQFLVLMNRVLALIVAGLSCVLCKQPRHGAPMYRY 
SFASLSNVLSSWCQYEALKFVSFPTQVIiAKASKVIPVMLMGKLV 
SRRSYEHWEYLTATLISIGVSMFLLSSGPEPRSSPATTLSGLIL 
LAG Y I AFDS FTS N WQDALFA YKMS S VQMMFG VNFFS CL FTVGSL 
LEQGALLEGTRFMGRHS EFAAHALLLS ICS ACGQLFI FYT IGQF 
GAAVFT 1 1 MTLRQAFAI LLS CLLYGHTVTVVGGLG VAWFAALL 
LRV YARGRLKQRG KKAVPVES PVQKV 


5507 


3704 


1271 


PRGTRRCRPAGRASRRARRRPPCPGPAAPGSLEIGGFGTAAGKK 
VAVADVQFK3PMRFHQDQLQVLLVFTKEDNQCNGFCRACEKAGFK 
CTVTKEAQAVLACFLDKHHDI I I IDHRNPRQLDAEALCRS I RSS 
KLS ENT VI VG WRR VDRE E LS VM P F I S AGFTRR Y VENPN I MAC Y 
NELLQLEFGEVRSQLKLRACNSVFTALENSEDAIEITSEDRFIQ 
YANPAFETTMGYQSGELIGKELGEVPINEKKADLLDTINSCIRI 
GKEWQG I YYAKKKNGDN I QQNVKI I PVIGQGGK I RHYVS I IRVC 
NGNNKAE KI S E CVQSDTHTDNQTGKHKDRRKGS LDVKAVAS RAT 
EVSSQRRHSSMARIHSMTIEAPITKVINIINAAQESSPMPVTEA 
LDRVLEILRTTELYSPQFGAKDDDPHANDLVGGLMSDGLRRLSG 
NE YVLS T KNTQMVS SNIITPIS LDD VP PRI ARAM ENEE YWD FD I 
FELE AATHNR PL I YLGLKM FARFG I CE FLHCS E S TLRS WLQ HE 
ANYHSSNP YHNS THSADVLHATAYFLS KER I KETLDPIDE VAAL 
IAATIHDVDHPGRTNSFLCNAGSELAILYNDTAVLESHHAALAF 
QLTTGDDKCNI FKNMERND YRTLRQG I IDMVLATEMTKHFEHVN 
KFVNSINKPLATLEENGETDKNQEVINTMLRTPENRTLIKRMLI 
KCADVSNPCRPLQYCIEWAARISEEYFSQTDEEKQQGLPWMPV 
FDRNTCS I PKSQ I S FI D YF I TDMFDAWDAF VDL PDLMQHLDNNF 
KYWKGLDEMKLRNLRPPPE 


5508 


1151 


691 


LSSVFSRRSASMFAVGCSMGPFLHYWYLSLDRLFPASGLRGFPN 
VLKKVLVDQLVASPLLGVWYFLGLGCLEGQTVGESCQELREKFW 
EFYKADWCVWPAAQFVNFLFVPPQFRVTYINGLTLGWDTYLSYL 
KYRSPVPLTPPGCVALDTRAD 


5509 


1238 


619 


RKSRGCQNALSASGPAAAAAAIMVRKLKFHEQKLLKQVDFLNWE 
VTDHNLHELRVLRRYRLQRREDYTRYNQLSRAVRELARRLRDLP 
ERDQFRVRASAALLDKLYALGLVPTRGSLELCDFVTASSFCRRR 
LPTVLLKLRMAQHLQAAVAFVEQGHVRVGPDWTDPAFLVTRSM 
EDFVTWVDSSKIKRHVLEYNEERDDFDLEA 


5510 


96 


1195 


PAGAHL SSGSS E PL VE PGRGR VG AR VKGERG LQAS G S APGRS KM 
AEGERQPPPDSSEEAPPATQNFIIPKKEIHTVPDMGKWKRSQAY 
ADYIGF I LTLNEGVKGKKLTFE YRVS EAI EKLVALLNTLDRW ID 
ETP P VDQ PS R FGNKA YRTWYAKLDEEAENL VATWP THLAAAVP 
E VAVYLKES VGNS TR I D YGTGHEAAFAAFLCCLCKIGVLRVDDQ 
I A I VFKVFNR YLEVMRKLQKTYRMEPAGSQG VWGLDD FQFLPFI 
WGS S QL I DH P YLE PRH F VDE KAVNENHKD YM FLEC I LF I TEM KT 
GPFAEHSNQLWNISAVPSWSKVNQGLIRMYKAECLEKFPVIQHF 
KFGSLLPIHPVTSG 


5511 


276 


1980 


KLSRVLNLPPENLITSISAVPISQKEEVADFQLSVDSLLEKDND 
HSRPDIQVQAKRLAEKLRCDTWSEISTGQRTVNFKINRELLTK 
TVLQQ V I E DGS K YG L KS EL FS GLPQKKI WE FS S PNVAKKFH VG 
HLRSTIIGNFIANLKEALGHQVIRINYLGDWGMQFGLLGTGFQL 
FGYEEKLQSNPLQHLFEVYVQVNKEAADDKSVAKAAQEFFQRLE 
LGDVQALSLWQKFRDLSIEEYIRVYKRLGVYFDEYSGESFYREK 
SQEVhKLLESKGLLhKTIKGTAWDLSGNGDPSSICTVMRSDGT 
SLYATRDLAAAIDRMDKYNFDTMIYVTDKGQKKHFQQVFQMLKI 
MGYDWAERCQHVPFGWQGMKTRRGDVTFLEDVLNEIQLRMLQN 
MASIKTTKELKNPQETAERVGLAALIIQDFKGLLLSDYKFSWDR 
VFQSRGDTGVFLQYTHARLHSLEETFGCGYLNDFNTACLQEPQS 
VSILQHLLRFDEVLYKSSQDFQPRHIVSYLLTLSHLAAVAHKTL 
QIKDSPPEVAGARLHLFKAVRSVLANGMKLLGITPVCRM 


5512 


120 


1015 


DPSLLLTITVTGVTVLVLVLKSMNSRRREPITLQDPEAKYPLPL 
IEKEKISHNTRRFRFGLPSPDHVLGLPVGNYVQLLAKIDNELW 
RAYTPVSSDDDRGFVDLIIKIYFKNVHPQYPEGGKMTQYLENMK 
IGET I FFRG PRGRLFYHGPGNLG IRPDQTSEPKKTLADHLGMI A 
GGTGITPMLQLIRHITKDPSDRTRMSLIFANQTEEDILVRKELE 
BIARTHPDQFDLWYTLDRPPIGWKYSSGFVTADMIKEHLPPPAK 
STLI LVCG P P PL IQTAAHPNLEKLG YTQDM I FTY 


5513 


2 


837 


ARWRLPSDSPRIPPAGAETPGRGSCRNYLPSSSPPPPEPSSFPS 
P PTSRGG PGS RDTMS DS E EESQDRQLKI WLGDGASG KTSLTTC 
FAQETFGKQYKQTIGLDFFLRRITLPGNLNVTLQIWDIGGQTIG 
GKMLDKYIYGAQGVLLVYDITNYQSFENLEDWYTWKKVSEESE 
TQ P L VALVGN K I DLEHMRT I KPE KHLR FCQENG FS SH F VS AKTG 

DSVFLCFQKVAAEILGIKLNKAEIEQSQRWKADIVNYNQEPMS 
RTVNPPRSSMCAVQ 


5514 


1295 


449 


VNRPSWIMGNFRGHALPGTFFFIIGLWWCTKSILKYICKKQKRT 
CYLGSKTLFYRLEILEGITIVGMALTGMAGEQFIPGGPHLMLYD 
YKQGHWNQLLGWHHFTMYFFFGLLGVADILCFTISSLPVSLTKL 
MLSNALFVEAF I FYNHTHGREMLDI FVHQLLVL WFLTGLVAFL 
EFLVRNNVLLELLRSSLILLQGSWFFQIGFVLYPPSGGPAWDLM 
DHENILFLTICFCWHYAVTIVIVGMNYAFITWLVKSRLKRLCSS 
EVGLLKNAEREQESEEEM 


5515 


1572 


260 


FVRLVGRGDCDPLLSVCLTTMPLYEGLGSGGEkTAwlDLGEAF
TKCG FAG ETG P RC 1 1 PS VI KRAGMPKP VRWQYN INTEELYS YL 
KE F I H I L YFRHLL VNPRDRR WI IESVLCPSHFRETLTRVLFKY 
FE VP S VLLAPSHLMALLTLG INS AMVLDCGYRES LVL P I YEG I P 
\HjNCWGALPLGGKAIjHKELETQIJjEQCTVDTSVAKEQSLPSVMG 
SVPEGVLEDIKARTCFVSDLKRGLKIQAAICF'NIDGNNERPSPPP 
NVDYPLDGEKILHILGSIRDSWEILFEQDNEEQSVATLILDSL 
I QCP I DTRKQLAENL WI GGTS MLPGFLHRLLAE I RYLVE KPKY 
KKALGTKTFRIHTPPAKANCVAWLGGAIFGALQDILGSRSVSKE 
YYNQTGRIPDWCSLNNPPLEMMFDVGKTQPPLMKRAFSTEK 


5516 


3 


735 


NSREPPQAGPGPSPRKSPTASSFLFPWRPLASSFWMGAQGAQES 
I KAMWRVPGTTRRPVTGES PGMHRPEAMIjLLLTIiALLGGPTWAG 
KMYGPGGGKYFSTTEDYDHEITGLRVSVGLLLVKSVQVKLGDSW 
DVKLGALGGNTQEVTLQPGEYITKVFVAFQAFLRGMVMYTSKDR 
Y F YFG KL DGQ I S SAY P SQEGQ VL VG I YGQ YQLLG I KS IGFEWNY 
PLEEPTTEPPVNLTYSANSPVGR


5517


246 


499 


SEIYVAMRTDSSKMTDVESGVANFASSARAGRRNALPDIQSSAA
TDGTS DL P LKLE ALS VKEDAKE KDEKTTQDQLEKPQNEE K 


5518 


3 


1375 


DAWADAWVRAWDLNMDFPCLWLGLLLPLVAALDFNYHRQEGMEA
FLKTVAQNYSSVTHLHSIGKSVKGRNLWVLWGRFPKEHRIGIP 
E FKYVANMHG DETVGR E LLLHL I D YL VTSDGKDPE I TNL INS TR 
IHIMPSMNPDGFEAVKKPDCYYSIGRENYNQYDLNRNFPDAFEY 
NNV S RQPET VAVMKWLKrETF VLSANLHGGAIj VAS YP FDNGVQA 
TGALYSRSLTPDDDVFQYLAHTYASRNPNMKKGDECKNKMNFPN 
GVTNGYSWYPLQGGMQDYNYIWAQCFEITLELSCCKYPREEKLP 
S FWNNNKAS L I EY I KQVHLG VKGQVFDQNGNPLPNVI VEVQDRK 
HICP YRTNKYGEYYLLLLPGS YI INVTVPGHDPHITKVI IPEKS 
QN FS ALKKD I LLP FQGQLDS I P VSN PS CPM I PL YRNLPDHSAAT 
KPSLFLFLVSLLHIFFK 


5519


87 


477 


IKSKLNQQVEVQESEWRLTEAKGPTMGKESGWDSGRAAVAAVVG 

GWAVGTVLVALSAMGFTSVGIAASSIAAKMMSTAAIANGGGVA

AGSLVAILQSVGAAGLSVTSKVIGGFAGTALGAWLGSPPSS 


5520


117 


943 


PTEGRQKVLKTFTVPRSALAMTKTSTCIYHFLVLSWYTFLNYYI 
SQEGKDEVKPKILANGARWKYMTLLNLLLQTIFYGVTCLDDVLK 
RTKGGKDIKFLTAFRDLLFTTLAFPVSTFVFLAFWILFLYNRDL 
I YP KVLDTV I P VWLNHAMHTF I FP I TLAEWLR PHS YPS KKTGL 
TLLAAAS I AY I SR I LWLYFETGTWVYP VFAKLS LLGLAAFFSLS 
YVF I AS I YLLG E K LNH W KW VS VQ I LQR WRLE S VG I C FQW P D WKS 
PAKHQ LVKN I R 


5521 


546 


911 


KILNMQKSCEENEGKPQNMPKAEEDRPLEDVPQEAEGNPQPSEE 
GVSQEAEGNPRGGPNQPGQGFKEDTPVRHLDPEEMIRGVDELER 
LREEIRRVRNKFVMMHWKQRHSRSRPYPVCFRP 


5522 


1224 


637 


GSRPLGQRSREKMWVFGYGSLIWKVDFPYQDKLVGYITNYSRRF 
WQGSTDHRGVPGKPGRWTLVEDPAGCVWGVAYRLPVGKEEEVK 
AYLDFREKGGYRTTTVIFYPKDPTTKPFSVLLYIGTCDNPDYLG 
PAPLEDIAEQ I FNAAGPSGRNTEYLFELANS IRNLVPEEADEHL 
FALEKLVKERLEGKQNLNCI 


5523 


3 


1280 


S KG KKRMG S S M S AATARR P VFDD KED VNFDH FQ I LRAI G KG S FG 
KV C I VQKRDT E KM YAM KY MNKQQ C I E RDE VRNV FRELE I LQE I E 
HVFLVNLWYS FQDEEDMFMWDLLLGGDLRYHLQQNVQFSEDTV 
RL Y I CEMALALD YLRGQH I I HRDVKPDNI LLD ERGHAHLTD FNI 
ATIIKDGERATALSGTKPYMAPEIFHSFVNGGTGYSFEVDWWSV 
GVMAYELLRGWRPYDIHSSNAVESLVQLFSTVSVQYVPTWSKEM 
VALLRKLLTVWPEHRLSSLQDVQAAPALAGVLWDHLSEKRVEPG 
F VPN KGRLH CD PT FE LE EM I LESR PLHKKKKRLAKNKS RDN S RD 
SSQSENDYLQDCLDAIQQDFVIFNREKLKRSQDLPREPLPAPES 
RDAAEPVEDEAERSALPMCGPICPSAGSG 


5524


85 


2318 


RERERDHRPGESSQGQSGAGGCFPSPTMELRCGGLLFSSRFDSG 
N LAH VE KVE S L S SDG EG VGGGAS ALTS G I AS S PD YE FNVWT R P D 
CAETEFENGNRSWFYFSVRGGMPGKLIKINIMNMNKQSKLYSQG 
MAPFVRTLPTRPRWERIRDRPTFEMTETQFVLSFVHRFVEGRGA 
TTFFAFCYPFSYSDCQELLNQLDQRFPENHPTHSSPLDTIYYHR 
BLLCYSLDGLRVDLLTITSCHGLREDREPRLEQLFPDTSTPRPF 
RFAGKRIFFLSSRVHPGETPSSFVPNGFLDFILRPDDPRAQTLR 
RLFVFKLIPMLNPDGWRGHYRTDSRGVNLNRQYLKPDAVLHPA 
I YGAKAVLLYHHVHSRLNSQS SSEHQ PSSCLPPDAPVSDLE KAN 
NLQNEAQCGHSADRHNAEAWKQTEPAEQKLNSVWIMPQQSAGLE 
ESAPDTIPPKESGVAYYVDLHGHASKRGCFMYGNSFSDESTQVE 
NMLYPKLISLNSAHFDFQGCNFSEKNMYARDRRDGQSKEGSGRV 
A I Y KASG1 IHS YTLE CN YNTGRS VNS I PAACHDNGRAS PP P P PA 
F PS R YT VEL FE Q VGRAMAI AALDMAE CNP W PR I VLS EHS S LTNL 
RAWML KHVRNS RG LS STLNVG VNKKRGLRTPPKSHNGLP VS CSE 
NTLSRARSFSTGTSAGGSSSSQQNSPQMKNSPSFPFHGSRPAGL 
PGLGSSTQKVTHRVLGPVRGKPVWEPLQHVFGCLGHCWGK 


5525 


105 


834 


SNTLDFERHLFIMGQQISDQTQLVINKLPEKVAKHVTLVRESGS 
LTYEE FLGRVAELNDVTAKVASGQEKHLLFE VQPGSDS SAFWKV 
WRWCTKINKS S G I VEASR I MNLYQF IQLYKDITSQAAGVLAQ 
SSTSEE PDENS S S VTS CQAS LWMGRVKQLTDEEECC I CMDGRAD 
L I LP CAHS FCQ KC I D KWSDRHRNCP I CR LQMTGANES W WS DA P 
TEDDMAN Y I LNMADE AGQPHRP 


5526 


3 


853 


RRPCNP VRAAKRTGAAARAPRGLEVTMLR VAWRTLS L I RTRAVT 
QVLVPGLPGGGSAKFPFNQWGLQPRSLLLQAARGYWRKPAQSR 
LDDDP P PS TLLKD YQNVPG I EKVDD WKRLLS LEMANKKEMLK I 
KQEQFMKKIVANPEDTRSLEARIIALSVKIRSYEEHLEKHRKDK 
AHKRYLLMS1DQRKKMLKNLRNTNYDVFEKICWGLGIEYTFPPL 
YYRRAHRRFVTKKALCIRVFQETQKLKKRRRALKAAAAAQKQAK 
RRNPDSPAKAIPKTLKDSQ 


5527 


3225 


565 


LLRKYLLHQNPLLLRHQP^TCISFSATMKLKDTKSRPKQSSCG

KFQTKGIKWGKWKEVKIDPNMFADGQMDDLVCFEELTDYQLVS 

PAKN PS S L FS KEAPKRKAQAVS EEEEEE EGKS SS PKKKI KLKKS 

KNVATEGTSTQKEFEVKDPELEAQGDDMVCDDPEAGEMTSENLV 

QTAPKKKKNKGKKGLEPSQSTAAKVPKKAKTWIPEVHDQKADVS 

AWKDLFVPRPVLRALS FIX3FSAPTP IQALTLAPAIRDKLDILGA 

AETGSGKTLAFAI PM I HAVLQWQKRNAAPP PSNTEAP PGETRTE 

AGAETRS PGKAEAESDALPDDTVI ES EALPSD I AAEARAKTGGT 

VSDQALLFGDDDAGEGPSSLIREKPVPKQNENEEENLDKEQTGN 

LKQELDDKSATCKAYPKRPLLGLVLTPTRELAVQVKQHIDAVAR 

FTGIKTAILVGGMSTQKQQRMLNRRPEIWATPGRLWELIKEKH 

YHLRNLRQLRCLWDEADRMVEKGHFAELSQLIiEMLNDSQYNPK 

RQTLVFSATLTLVHQAPARILHKKHTKKMDKTAKLDLLMQKIGM 

RG KP KV I DLTRNEATVE TLTETK I HCETDE KDF YL Y Y FLMQ YPG 

RSLVFANS I S C I KRLS GLLKVLD I M PLTLHACMHQKQRLRNLEQ 

FARLEDCVLLATDVAARGLDIPKVQHVIHYQVPRTSEIYVHRSG 

RTARATNEG L S LML I G PED V I N F KK I YKTLKKDE D I P L FP VQTK 

YMD W KE R I RLARQ I E KS E YRN FQACLHNS W I EQ AAAALE I EL E 

EDMYKGGKADQQEERRRQKQMKVLKKELRHLLSQPLFTESQKTK 

YPTQSGKPPLLVSAPSKSESALSCLSKQKKKKTKKPKEPQPEQP 

QPSTSAN 


5528 


3 


895 


GPFLSACRMWGACKVKVHDSLATISITLRRYLRLGATMAKSKFE 
YVRDFEADDTCLAHCWVWRLDGRNFHRFAEKHNFAKPNDSRAL 
QLMTKCAQTVMEELEDIVIAYGQSDEYSFVFKRKTNWFKRRASK 
FMTHVASQFASSYVFYWRDYFEDQPLLYPPGFDGRVWYPSNQT 
LKDYLSWRQADCH1NNLYNTVFWALIQQSGLTPVQAQGRLQGTL 
AADKNE ILFSEFNINYNNEPPMYRKGTVLIWQKVDEVMTKE I KL 
PTEMEGKKMAVTRTRTKPCKPSHLPRAPCLRWL 


5529 


46 


640 


TFRLVSAHLKTRKblNPEAAERRWRDWDSRQGWLSVKMQRVSGL 
LSWTLSRVLWLSGLSEPGAARQPRIMEEKALEVYDLIRTIRDPE 
KPNTLEELEWSESCVEVQEINEEEYLVIIRFTPTVPHCSLATL 
IGLCLRVKLQRCLPFKHKLEIYISEGTHSTEEDINKQINDKERV 
AAAMEN PNLR E I VEQC VLE PD 


5530 


4541 


2606 


AQIVHAISYCHKLHVGHRDLKPENWFFEKQGLVKLTDFGFSNK 
FQPGKKLTTSCGSLAYSAPEILLGDEYDAPAVDIWSLGVILFML 
VCGQPPFQEANDSETLTMIMDCKYTVPSHVSKECKDLITRMLQR 
DPKRRASLEEIENHPWLQGVDPSPATKYNIPLVSYKNLSEEEHN 
S I I QRMVLGD I ADRDAI VE ALETNRYNHITAT YFLLAER I LREK 
QEKEIQTRSASPSNIKAQFRQSWPTKIDVPQDLEDDLTATPLSH 
ATVPQS PARAADSVLNGHRS KGLCDS AKKDDLPE LAG PALS IVP 
PASLKPTASGRKCLFRVEEDEEEDEEDKKPMSLSTQWLRRKPS 
VTNRLTSRKSAPVLNQIFEEGESDDEFDMDENLPPKLSRLKMNI 
ASPGTVHKRYHRRKSQGRGSSCSSSETSDDDSESRRRLDKDSGF 
TYSWHRRDSSEGPPGSEGDGGGQSKPSNASGGVDKASPSENNAG 
GGSPSSGSGGNPTMTSGTTRRCAGPSNSMQLASRSAGELVESLK 
LMSLCLGSQLHGSTKYIIDPQNGLSFSSVKVQEKSTWKMCISST 
GNAGQVPAVGGI KFFSDHMADTTTELER I KSKNLKNNVLQLPLC 
EKTISVNIQRNPKEGLLCASSPASCCHVI 


5531 


24 


515 


GSOPRAPRPRDSMERPEPELIRQSWRAVSRSPLEHGTVLFARLF 
ALEPDLLPLFQYNCRQFSSPEDCLSSPEFLDHIRKVMLVIDAAV 
TNVEDLSSLEEYLASLGRKHRAVGVKLSSFSTVGESLLYMLEKC 
LGPAFTPATRAAWSQLYGAWQAMSRGWDGE 


5532 


3395 


1402 


SDWMWGKRKMIIEDETEFCGEELLHSVLQCKSVFDVLDGEEMR 
RARTRANPYEMIRGVFFLNRAAMKMANMDFVFDRMFTNPRDSYG 
KPLVKDREAELLYFADVCAGPGGFSEYVLWRKKWHAKGFGMTLK 
GPNDFKLEDFYSASSELFBPYYGEGGIDGDGDITRPENISAFRN 
FVLDNTDRKGVHFLMADGGFSVEGQENLQEILSKQLLLCQFLMA 
LSIVRTGGHFICKTFDLFTPFSVGLVYLLYCCFERVCLFKPITS 
RPANSERYWCKGLKVGIDDVRDYLFAVNIKLNQLRNTDSDVNL 
WPLEVIKGDHEFTDYMIRSNESHCSLQIKALAKIHAFVQDTTL 
SEPRQAEIRKECLRLWGIPDQARVAPSSSDPKSKFFELIQGTEI 
DIFSYKPTLLTSKTLEKIRPVFDYRCMVSGSEQKFLIGLGKSQI 
YTWDGRQSDRWIKLDLKTELPRDTLLSVEIVHELKGEGKAQRKI 
SA I H I LD VLVLNGTD VRE QH FNQR I Q LAE KF VKAVS K P S R PDMN 
PIRVKEVYRLEEMEKIFVRLEMKIIKGSSGTPKLSYTGRDDRHF 
VPMGLYIVRTVNEPWTMGFSKSFKKKFFYNKKTKDSTFDLPADS 
IAP FHI C YYGRLF WEWGDG I RVHDSQKPQDQDKLS KEDVLS FIQ 
MHRA 


5533 


94 


789 


MKE RRAPQP VVARCKLVLVGDVQCG KTAMLQVLAKDCYP ETY VP 
TVFEN YTACL ETE EQR VE LS L WDTSG S PYYDNVR P LC YS DS D A V 
LLCFDISRPETVDSALKKWRTEILDYCPSTRVLLIGCKTDLRTD 
LSTLMELSHQKQAPISYEQGCAIAKQLGPE1YLEGSAFTSEKSI 
HSIFRTASMLCLNKPSPLPQKSPVRSLSKRLLHLPSRSELISPT 
FKKE KAKXCS I M 


5534 


3 


605 


LVRGRARAANPGRVGAMDGLRQRVEHFLEQRNLVTEVLGALEAK 
TGVEKRYLAAGAVTLLS LYLLFG YGASLLCNL IGFVYP AYA q T If 
AIESPSKDDDTVWLTYWWYALFGLAEFFSDLIiLSWFPFYYVGK 
CAFLLFCMAPRPWNGALMLYQRWRPLFLRHHGAVDRIMNDLSG 
RALDAAAGITRNVKPSQTPQPKDK 


5535 


1029 


332 


KSFMDSEARLCSLVELSDTQDETQKSDSENEDLKIDCLQESQEL 
NLQKLKNSERILTEAKQKMRELTVNIKMKEDLIKELIKTGNDAK
SVSKQYTLKVTKLEHDAEQAKVELTETQKQLQEIjENKDLSDVAM 
KVKLQKE FRKKVDAAKLRVQ VLQKKQQDS KKLAS LS IQNE KRAN 
E LEQS VDHMKYQ K I QLQRKLQEEN E KRKQLDAV I KRDQQKI KVI 
LSYIPAKYNMKC 


5536 


942 


282 


AAATAAS LS PRG CRLRT P S S DVS PS RAP P PSAAPL PTGRAQMS P 
S GRLC LLTI VGL I LP TRGQTLKDTTS S S SADAT IMD I QVP TRAP 
DAVYTELQPTSPTPTWPADETPQPQTQTQQLEGTDGPLVTDPET 
HKSTKAAHPTDDTTTLSERPSPSTDVQTDPQTLKPSGFHEDDPF 
F YDEHTLRKRGLLVAAVLFI TG I II LTSGKCRQLS RLCRNHCR 


5537 


3 


2391 


RARVSSPQLRVFRSGRPRRLRVLRINRTSVALRLAGTGRFVAKT 
PGHPGSWEMGLLTFRDVAVEFSLEEWEHLEPAQKNLYQDVMLEN 
YRNLVSLGLWSKPDLITFLEQRKEPWNVKSEETVAIQPDVFSH 
YNKDLLTEHCTEAS FQKVISRRHGSCDLENLHLRKRWKREECEG 
HNGCYDEKTFKYDQFDESSVESLFHQQ I LSSCAKS YNFDQYRKV 
FTHSSLLNQQEEIDIWGKHHIYDKTSVLFRQVSTLNSYRNVFIG 
EKNYHCNNSEKTLNQSSSPKNHQENYFLEKQYKCKEFEEVFLQS 
MHGQEKQEQSYKCNKCVEVCTQSLKHIQHQTIHIRENSYSYNKY 
DKDLSQSSNLRKQIIHNEEKPYKCEKCGDSLNHSLHLTQHQIIP 
TEEKPYKWKECGKVFNLNCSLYLTKQQQIDTGENLYKCKACSKS 
FTRSSNLIVHQRIHTGEKPYKCKECGKAFRCSSYLTKHKRIHTG 
EKPYKCKECGKAFNRSSCLTQHQTTHTGEKLYKCKVCSKSYARS 
SNL I MHQRVHTG E KP YKCKB CG KVF SRS S CLTQHRK I HTGENL Y 
KCKVCAKPFTCFSNLIVHERIHTGEKPYKCKECGKAFPYSSHLI 
RHHR I HTG E KP YKCKACS KS FS DS SG LT VHRRTHTG E K P YTCKE 
CGKAFS YSSDVIOHRR IHTGORPYKCEECGKAFNYP 9 YT.TTHnp 
SHTGERPYKCEECGKAFNSRSYLTTHRRRHTGERPYKCDECGKA 
FS YRS YLTTHRRSHS GERP YKCE E CGKAFNS RS YL I AHQRSHTR 
EKL 


5538 


926 


161 


HSMMMKIPWGSIPVLMLLLLLGLIDISQAQLSCTGPPAIPGIPG 
IPGTPGPDGQPGTPGIKGEKGLPGLAGDHGEFGEKGDPGIPGNP 
GKVGPKGPMGPKGGPGAPGAPGPKGESGDYKATQKIAFSATRTI 
NVPLRRDQTIRFDHVITNMIJNNYEPRSGKFTCKVPGLYYFTYHA 
SSRGNLCVNLMRGRERAQKVVTFCDYAYNTFQVTTGGMVLKLEQ 
G ENVFLQ ATDKNS LLGMEGANS I FSGFLLFPDMEA 


5539 


38 


1258 


HRGPSGAAAPGCALPRGQALEGPRSCRRPQPMARRYDELPHYPG
IVDGPAALASFPETVPAVPGPYGPHRPPQPLPPGLDSDGLKREK 
DEIYGHPLFPLLALVFEKCELATCSPRDGAGAGLGTPPGGDVCS 
SDSFNEDIAAFAKQVRSERPLFSSNPELDNLVIQAIQVLRFHLL 
ELEKVHDLCDNFCHRYITCLKGKMPIDLVIEDRDGGCREDFEDY 
PASCPSLPDQNNMWIRDHEDSGSVHLGTPGPSSGGLASQSGDNS 
SDQGDGLDTSVASPSSGGEDEDLDQERRRNKKRGIFPKVATNIM 
RAWLFQHLSHPYPSEEQKKQLAQDTGLTILQVNNWFINARRRIV 
QPMIDQSNRTGQGAAFSPEGQPIGGYTETQPHVAVRPPGSVGMS 
LNLEGEWHYL 


5540 


148 


1440 


PPLGAGAGVHARSPHPARRLPLTTAGVGf4RAPnT.T.DT"bwT?nwDr'" 
PSGAAAPGCALPRGQALEGPRSCRRPQPMARRYDELPHYPGIVD
GPAALASFPETVPAVPGPYGPHRPPQPLPPGLDSDGLKREKDEI
YGHPLFPLLALVFEKCELATCSPRDGAGAGLGTPPGGDVCSSDS
FNEDNTAFAKQVRSERPLFSSNPELDNLMIQAIQVLRFHLLELE
KGKMPIDLVIEDRDGGCREDFEDYPASCPSLPDQNNTWIRDHED
SGSVHLGTPGPSSGGLASQSGDNSSDQGVGLDTSVASPSSGGED
EDLDQEPRRNKKRGIFPKVATNIMRAWLFQHLSHPYPSEEQKKQ
LAQDTGLTILQVNNWFINARRRIVQPMIDQSNRTGQGAAFSPEG
QPIGGYTETEPHVAFRAPASVGDEFGTRKEEWHYL


5541 


148 


1440 


1 c uwrwjnu v nni\c r nrrti\nijr jj i J. i-vo V ^wrCrtir lJijl_ir 1 fWKQHRG 

PSGAAAPGCALPRGQALEGPRSCRRPQPMARRYDELPHYPGIVD
GPAALASFPETVPAVPGPYGPHRPPQPLPPGLDSDGLKREKDEI
YGHPLFPLLALVFEKCELATCSPRDGAGAGLGTPPGGDVCSSDS
FNEDNTAFAKQVRSERPLFSSNPELDNLMIQAIQVLRFHLLELE
KGKMPIDLVIEDRDGGCREDFEDYPASCPSLPDQNNIWIRDHED
SGSVHLGTPGPSSGGLASQSGDNSSDQGVGLDTSVASPSSGGED
EDLDQEPRRNKKRGIFPKVATNIMRAWLFQHLSHPYPSEEQKKQ
LAQDTGLTILQVNNWFINARRRIVQPMIDQSNRTGQGAAFSPEG
QPIGGYTETEPHVAFRAPASVGDEFGTRKEEWHYL


5542 


148 


1440 


PPLGAGAGVHARSPHPARRT.fcT.TTRrjVPHQ&onT'T TJT - nljD<^ijn« **"" 

PSGAAAPGCALPRGQALEGPRSCRRPQPMARRYDELPHYPGIVD 
GPAALASFPETVPAVPGPYGPHRPPQPLPPGLDSDGLKREKDEI 
YGHPLFPLLALVFEKCELATCSPRDGAGAGLGTPPGGDVCSSDS 
FNEDNTAFAKQVRSERPLFSSNPELDNLMIQAIQVLRFHLLELE
KGKMPIDLVIEDRDGGCREDFEDYPASCPSLPDQNNIWIRDHED
SGSVHLGTPGPSSGGLASQSGDNSSDQGVGLDTSVASPSSGGED
EDLDQEPRRNKKRGIFPKVATNIMRAWLFQHLSHPYPSEEQKKQ
LAQDTGLTILQVNNWFINARRRIVQPMIDQSNRTGQGAAFSPEG
QPIGGYTETEPHVAFRAPASVGDEFGTRKEEWHYL


5543 


2405 


665 


RWVREQPWPLRTSEAVKTPALRPFPGPRGVSPFPKPDWGKSPAP
KRPFSDSGAFWSPERRPGVLEAPRRRPVPASFRAVPPKPTRVHG 
SSASRDRVLARTMIVADSECRAELKDYLRFAPGGVGDSGPGEEQ 
RESRARRGPRGPSAFIPVEEVLREGAESLEQHLGLEALMSSGRV 
DNLAWMGLHPDYFTS FWRLHYLT.T.HTnnPT.ac owduvt a t via a 

ARHQCSYLVGSHMAEFLQTGGDPEWLLGLHRAPEKLRKLSEINK 
LLAHRPWLITKEHIQALLKTGEHTWSLAELIQALVLLTHCHSLS 
SFVFGCGILPEGDADGSPAPQAPTPPSEQSSPPSRDPLNNSGGF 
ESARDVEALMERMQQLQESLLRDEGTSQEEMESRFELEKSESLL 
VTPSADILEPSPHPDMLCFVEDPTFGYEDFTRRGAQAPPTFRAQ 
DYTWEDHGYSLIQRLYPEGGQLLDEKFQAAYSLTYNTIAMHSGV 
DTSVLRRAIWNYIHCVFGIRYDDYDYGEVNQLLERNLKVYIKTV 
ACYPEKTTRRMYNLFWRHFRHSEKVHVNLLLLEARMQAALLYAL 
RAITRYMT 


5544 


1895 


514 


LGGLLGRQRLLLRMGAGRLGAPMERHGRASATSVSSAGEQAAGD 
PEGRRQEPLRRRASSASVPAVGASAEGTRRDRLGSYSGPTSVSR 
QRVESLRKKRPLFPWFGLDIGGTLVKLVYFEPKDITAEEBEEEV 
ESLKSIRKYLTSNVAYGSTGIRDVHLELKDLTLCGRKGNLHFIR 
FPTHDMPAFIQMGRDKNFSSLHTVFCATGGGAYKFEQDFLTIGD 
LQLCKLDELDCLIKGILYIDSVGFNGRSQCYYFENPADSBKCQK 
LPFDLKNPYPLLLVNIGSGVSILAVYSKDNYKRVTGTSLGGGTF 
FGLCCLLTGCTTFEEALEMASRGDSTKVDKLVRDIYGGDYERFG 
LPGWAVASSFGNMMSKEKREAVSKEDLARATLITITNNIGSIAR 
MCALNEN INQ WF VGN, FLR INT I AMRLLA YALD YWS KGQLKAL F 
SEHEGYFGAVGALLELLKIP 
SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 


5545 


802 


131 


GAMWSAGRGGAAWPVLIX5LLLALLVPGGGAAKTGAELVTCGSVL 
KLLNTHHRVRLHSHDIKYGSGSGQQSVTGVEASDDANSYWRIRG 
3SEGGCPRGSPVRCGQAVRLTHVLTGKNLHTHHFPSPLSNNQEV 
SAFGEDGEGDDLDLWTVRCSGQHWEREAAVRFQHVGTSVFLSVT 
GEQYGSPIRGQHEVHGMPSANTHNTWKAMEGIFIKPSVEPSAGH 
DEL 


5546 


1592 


146 


FVPRGGHSSMGQSGRSRHQKRARAQAQLRNLEAYAANPHSFVFT 

KGCFGKNIRQLSLDVRRVMEPLTASRLQVRKKNSLKDCVAVAGP 
LGVTHFLILSKTETNVYFKLMRLPGGPTLTFQVKKYSLVRDVVS 
SLRRHRMHEQQFAHPPLLVLNSFGPHGMHVKLMATMFQNLFPSI 
NVHKVNLNTIKRCLLIDYNPDSQELDFRHYSIKVVPVGASRGMK 
KLLQEKFPNMSRLQDISELLATGAGLSESEAEPDGDHNITELPQ 
AVAGRGNMRAQQSAVRLTEIGPRMTLQLIKVQEGVGEGKVMFHS 
FVSKTEEELQAILEAKEKKLRLKAQRQAQQAQNVQRKQEQREAH 
RKKSLEGMKKARVGGSDEEASGIPSRTASLELGEDDDEQEDDDI 
EYFCQAVGEAPSEDLFPEAKQKRLAKSPGRKRKRWEMDRGRGRL 
CDQKFPKTKDKSQGAQARRGPRGASRDGGRGRGRGRPGKRVA 


5547 


1592 


146 


FVPRGGHSSMGQSGRSRHQKRARAQAQLRNLEAYAANPHSFVFT 
KGCFGKNIRQLSLDVRRVMEPLTASRLQVRKKNSLKDCVAVAGP 
LGVTHFLILSKTETNVYFKLMRLPGGPTLTFQVKKYSLVRDVVS 
SLRRHRMHEQQFAHPPLLVLNSFGPHGMHVKLMATMFQNLFPSI 
NVHKVNLNTIKRCLLIDYNPDSQELDFRHYSIKVVPVGASRGMK 
KLLQEKFPNMSRLQDISELLATGAGLSESEAEPDGDHNITELPQ 
AVAGRGNMRAQQSAVRLTEIGPRMTLQLIKVQEGVGEGKVMFHS 
FVSKTEEELQAILEAKEKKLRLKAQRQAQQAQNVQRKQEQREAH 
RKKSLEGMKKARVGGSDEEASGIPSRTASLELGEDDDEQEDDDI 
EYFCQAVGEAPSEDLFPEAKQKRLAKSPGRKRKRWEMDRGRGRL 
CDQKFPKTKDKSQGAQARRGPRGASRDGGRGRGRGRPGKRVA 


5548 


1 


2153 


DQTGPPETIAFTFPRSTMEPLCPLLLVGFSLPLARALRGNETTA 

DSNETTTTSGPPDPGASQPLLAWLLLPLLLLLLVLLLAAYFFRF 

RKQRKAWSTSDKKMPNGILEEQEQQRVMLLSRSPSGPKKYFPI 

PVEHLEEEIRIRSADDCKQFREEFNSLPSGHIQGTFELANKEEN 

REKNRYPNILPNDHSRVILSQLDGIPCSDYINASYJDGYKBKNK 

FIAAQGPKQETVNDFWRMVWEQKSATIVMLTNLKERKEEKCHQY 

WPDQGCWTYGNIRVCVEDCWLVDYTIRKFCIQPQLPDGCKAPR 

T.UCnT.UTPTCWDnPfJUDPTDTPMT VI7T VVT.TVTT XT TM JTJ A »T» T T n ru n 

bvbyLrtr loWrJJr L>v± J r X rXwMxjJ\I:xjiUWj\AxxNPVnAG 

SAGVGRTGTFIVIDAMMAMMHAEQKVDVFEFVSRIRNQRPQMVQ 

TDMQYTFIYQALLEYYLYGDTELDVSSLEKHLQTMHGTTTHFDK 

IGLEEEFRKLTNVRIMKENMRTGNLPANMKKARVIQIIPYDFNR 

VILSMKRGQEYTDYINASFIDGYRQKDYFIATQGPLAHTVEDFW 

KM 1 W h W K.aH 1 1 VMLi I L VQEREQDKCYQ YWPTEGS VTHGE I TI E I 

KNDTLSEAISIRDFLVTLNQPQARQEEQVRWRQFHFHGWPEIG 

IPAEGKGMIDLIAAVQKQQQQTGNHPITVHCSAGAGRTGTFIAL 

SNILERVKAEGLLDVFQAVKSLRLQRPHMVQTLEQYEFCYKWQ 

DFIDIFSDYANFK 


5549 


915 


256 


Fl^TGGKRLAFKMAGTARHDREMAIQAKKKLTTATDPIERLRLQ 
CLARGSAGIKGLGRVFRIMDDDNNRTLDFKEFMKGLNDYAWME 
KEEVEELFQRFDKDGNGTIDFNEFLLTLRPPMSRARKEVIMQAF 
RKLDKTGDGVITIEDLREVYNAKHHPKYQNGEWSEEQVFRKFLD 
NrU^F iUJvLajxjv l f fcih.r MIn i lAUVbAblDTUVYr I IMMRTAWKL 


5550 


2364 


1210 


RKRKVFLKMRRLNRKKTLST.VK RTj^AFPTO/PTT YVPT«?a dfirtTV "" 
SLx^FTTMALLTIMEFSVYQDTWMKYEYEVDKDFSSKLRINIDI 
TVAMKCQYVGADVLDLAETMVASADGLVYEPTVFDLSPQQKEWQ 
RMLQLIQSRLQEEHSLQDVIFKSAFKSTSTALPPREDDSSQSPN 
ACRIHGHLYWKVAGNFHITVGKAIPHPRGHAHLAALVNHESYN 
FSHRIDHLSFGELVpAI INPLDGTEKIAIDHNQMFQYFITWPT 
KLHTYKISADTHQFSVTERERIINHAAGSHGVSGIFMKYDLSSL 
MVTVTEEHMPFWQFFVRLCGIVGGIFSTTGMLHGIGKFIVEIIC 
CRFRLGSYKPVNSVPFEDGHTDNHLPLLENNTH 


5551 


211 


1700 


MQRDHTMDYKESCPSVSIPSSDEHREKKKRFTVYKVLVSVGRSE 
WFVFRRYAEFDKLYNTLKKQFPAMALKtPAKRIFGDNFDPDFIK 
QRRAGLNEFIQNLVRYPELYNHPDVRAFLQMDSPKHQSDPSEDE 
DERSSQKLHSTSQNINLGPSGNPHAKPTDFDFLKVIGKGSFGKV 
LIAKRKLDGKFYAVKVLQKKIVLNRKEQKHIMAERNVLLKNVKH 
PFLVGLHYSFQTTEKLYFVLDFVNGGELFFHLQRERSFPEHRAR 
FYAAEIASALGYLHSIKIVYRDLKPENILLDSVGHWLTDFGLC 
KEGIAISDTTTTFCGTPEYLAPEVIRKQPYDNTVDWWCLGAVLY 
EMLYGLPPFYCRDVAEMYDNILHKPLSLRPGVSLTAWSILEELL 
EKDRQNRLGAKEDFLEIQNHPFFESLSWADLVQKKIPPPFNPNV 
AGPDDIRNFDTAFTEETVPYSVCVSSDYSIVNASVLEADDAFVG 
FSYAPPSEDLFL 


5552 


2748 


930 


LGPAAGAAMGKKHKKHKAEWRSSYEDYAbKPLEKPLKLVLKVGG 
SEVTELSGSGHDSSYYDDRSDHERERHKEKKKKKKKKSEKEKHL 
DDEERRKRKEEKKRKREREHCDTEGEADDFDPGKKVEVEPPPDR 
PVRACRTQPAENESTPIQQLLEHFLRQLQRKDPHGFFAFPVTDA 
IAPGYSMIIKHPMDFGTMKDKIVANEYKSVTEFKADFKLMCDNA 
MTYNRPDTVYYKLAKKILHAGFKMMSKQAALLGNEDTAVEEPVP 
EWPVQVETAKKSKKPSREVISCMFEPEGNACSLTDSTAEEHVL 
ALVEHAADEARDRINRFLPGGKMGYLKRNGDGSLIiYSWNTAEP 
DADEEETHPVDLSSLSSKLLPGFTTLGFKDERRNKVTFLSSATT 
ALSMQNNSVFGDLKSDEMELLYSAYGDETGVQCALSLQEFVKDA 
GSYSKKWDDLLDQITGGDHSRTLFQLKQRRNVPMKPPDEAKVG 
DTLGDSSSSVLEFMSMKSYPDVSVDISMLSSLGKVKKELDPDDS 
HLNLDETTKLLQDLHEAQAERGGSRPSSNLSSLSNASERDQHHL 
GSPSRLSVGEQPDVTHDPYEFLQSPEPAASAKT 


5553 


74 


1095 


LGREAVYLVSRMDGPVAEHAKQEPFHWTPLLESWALSQVAGMP 
VFLKCENVQPSGSFKIRGIGHFCQEMAKKGCRHLVCSSGGNAGI 
AAAYAARKLGIPATIVLPESTSLQWQRLQGEGAEVQLTGKVWD 
EANLRAQELAKRDGWENVPPFDHPLIWKGHASLVQELKAVLRTP 
PGALVLAVGGGGLLAGWAGLLEVGWQHVPIIAMETHGAHCFNA 
AITAGKLVTLPDITSVAKSLGAKTVAARALECMQVCKIHSEWE 
DTEAVSAVQQLLDDERMLVEPACGAALAAIYSGLLRRLQAEGCL 
PPSLTSVWIVCGGNNINSRELQALKTHLGQV 


5554 


166 


2318 


CSGRTGGRGSLRPAENVCLTCKLSGAETRGLLCPAIOTWIMKVL 
GRSFFWVLFPVLPWAVQAVEHEEVAQRVIKLHRGRGVAAMQSRQ 
WVRDSCRKLSGLLRQKNAVLNKLKTAIGAVEKDVGLSDEEKLFQ 
VHTFEIFQKELNESENSVFQAVYGLQRALQGDYKDWNMKESSR 
QRLEALREAAIKEETEYMELLAAEKHQVEALKNMQHQNQSLSML 
DEILEDVRKAADRLEEEIEEHAFDDNKSVKGVNFEAVLRVEEEE 
ANSKQNITKREVEDDLGLSMLIDSQNNQYILTKPRDSTIPRADH 
HFIKDIVTIGMLSLPCGWLCTAIGLPTMFGYIICGVLLGPSGLN 
SIKSIVQVETLGEFGVFFTLFLVGLEFSPEKLRKVWKISLQGPC 
YMTLLMIAFGLLWGHLLRIKPTQSVFISTCLSLSSTPLVSRFLM 
GSARGDKEGDIDYSTVLLGMLVTQDVQLGLFMAVMPTLIQAGAS 
ASSSIWEVLRILVLIGQILFSLAAVFLLCLVIKKYLIGPYYRK 
LHMESKGNKEILILGISAFIFLMLTVTELLDVSMELGCFLAGAL 
VSSQGPWTEEIATSIEPIRDFLAIVFFASIGLHVFPTFVAYEL 
TVLVFLTLSVWMKFLLAALVLSLILPRSSQYlKWrVSAGLAQV 
SSFSFVLGSRARRAGVISREVYLLILSVTTLSLLLAPVLWRAAI 
TRCVPRPERRSSL 


5555 
5556 


212 
5835 


1425 
3346 


LSLRTRETPAPPRCEAASQGRVGWRADAAAEEAVRSVWNRTRDR 
GTMAPQNLSrFCLLLLYLIGAVIAGRDFYKILGVPRSASIKDIK 
KAYRKLALQLHPDRNPDDPQAQEKFQDLGAAYEVLSDSEKRKQY 
DTYGEEGLKDGHQSSHGDIFSHFFGDFGFMFGGTPRQQDRNIPR 
GSDIIVDLEVTLEEVYAGNFVEVVRNKPVARQAPGKRKCNCRQE 
MRTTQLGPGRFQMTQEVGCDECPNVKLVNEERTLEVEIEPGVRD 
GMEYPFIGEGEPHVDGEPGDLRFRIKWKHPIFERRGDDLYTNV 
TISLVESLVGFEMDITHLDGHKVHISRDKITRPGAKLWKKGEGL 
PNFDNNNIKGSLIITFDVDFPKEQLTEEAREGIKQLLKQGSVQK 
VYNGLQGY 

RTRGMSKNCVPMEFEEYLLRMFQGTFYLLQKITKDNNAHTVKSR 
LEELDESYIEKFTDFLRLFVSVHLRRIESYSQFPWEFLTLLFK 
YTFHQPTHEGYFSCLDIWTLFLDYLTSKIKSRLGDKEAVLNRYE 
DALVLLLTEVLNRIQFRYNQAQLEELDDETLDDDQQTEWQRYLR 
QSLEWAKVMELLPTHAFSTLFPVLQDNLEVYLGLQQFIVTSGS 
GHRLNITAENDCRRLHCSLRDLSSLLQAVGRLAEYFIGDVFAAR 
FNDALTWERLVKVTLYGSQIKLYNIETAVPSVLKPDLIDVHAQ 
SIAALQAYSHWLAQYCSEVHRQNTQQFVTLISTTMDAITPLIST 
KVQDKLLLSACHLLVSLATTVRPVFLISIPAVQKVFNRITDASA 
LRLVDKAQVLVCRALSNILLLPWPNLPENEQQWPVRSINHASLI 
SALSRDYRNLKPSAVAPQRKMPLDDTKLIIHQTLSVLEDIVENI 
SGESTKSRQICYQSLQESVQVSLALFPAFIHQSDVTDEMLSFFL 
TLFRGLRVQMGVPFTEQIIQTFLNMFTREQLAESILHEGSTGCR 
WEKFLKILQVWQEPGQVFKPFLPSIIALCMEQVYPIIAERPS 
PDVKAELFELLFRTLHHNWRYFFKSTVLASVQRGIAEEQMENEP 
QFSAIMQAFGQSFLQPDIHLFKQNLFYLBTLNTKQKLYHKKIFR 
TAMLFQFVNVLLQVLVHKSHDLLQEEIGIAIYNMASVDFDGFFA 
AFLPEFLTSCDGVDANQKSVLGRNFKMDRVRRERGRAKRRAEWA 
RKPGTCAARRGHIEASGRGLCPPCSLAAAHEMPADLVL 


5557 


1712 


491 


VILGAGLRDKDMWIPWGLPRRLRLSALAGAGRFCILGSEAATR 
KHLPARNHCGLSDSSPQLWPEPDFRNPPRKASKASLDFKRYVTD 
RRLAETLAQIYLGKPSRPPHLLLECNPGPGILTQALLEAGAKW 
ALESDKTFIPHLESLGKNLDGKLRVIHCDFFKLDPRSGGVIKPP 
AMSSRGLFKNLGIEAVPWTADIPLKWGMFPSRGEKRALWKLAY 
DLYSCTSIYKFGRIEVNMFIGEKEFQKLMADPGNPDLYHVLSVI 
WQLACEIKVLHMEPWSSFDIYTRKGPLENPKRRELLDQLQQKLY 
LIQMIPRQNLFTKNLTPMNYNIFFHLLKHCFGRRSATVIDHLRS 
LTPLDARDILMQIGKQEDEKWNMHPQDFKTLFETIERSKDCAY 
KWLYDETLEDR 


5558 


1509 


96 


RAGCTHPQVPADLGAPAEPRRPQKTCVCLLQPQPGGQRGPTTMI 
TGVFSMRLWTPVGVLTSLAYCLHQRRVALAELQEADGQCPVDRS 
LLKLKMVQWFRHGARSPLKPLPLEEQVEWNPQLLEVPPQTQFD 
YTVTNLAGGPKPYSPYDSQYHETTLKGGMFAGQLTKVGMQQMFA 
LGERLRKNYVEDIPFLSPTFNPQEVFIRSTNIFRNLESTRCLLA 
GLFQCQKEGPIIIHTDEADSEVLYPNYQSCWSLRQRTRGRRQTA 
SLQPGISEDLKKVKDRMGIDSSDKVDFFILLDNVAAEQAHNLPS 
CPMLKRFARMIEQRAVDTSLYILPKEDRESLQMAVGPFLHILES 
NLLKAMDSATAPDKIRKLYLYAAHDVTFIPLLMTLGIFDHKWPP 
FAVDLTMELYQHLESKEWFVQLYYHGKEQVPRGCPDGLCPLDMF 
LNAMSVYTLSPEKYHALCSQTQVMEVGNEE 


5559 


150 


1983 


PLAATAHFAKMSRVAKYRRQVSEDPDIDSLLETLSPEEMEELEK 
ELDWDPDGSVPVGLRQRNQTEKQSTGVYNREAMLNFCEKETKK 
LMQREMSMDESKQVETKTDAKNGEERGRDASKKALGPRRDSDLG 
KEPKRGGLKKSFSRJDRDEAGGKSGEKPKEEKIIRGIDKGRVRAA 
VDKKEAGKDGRGEERAVATKKEEEKKGSDRNTGLSRDKDKKREE 
MKEVAKKEDDEKVKGERRNTDTRKEGEKMKRAGGNTDMKKEDEK 
VKRGTGNTDTKKDDEKVKKNEPLHEKEAKDDSKTKTPEKQTPSG 
PTKPSEGPAKVEEEAAPSIFDEPLERVKNNDPEMTEVNVNNSDC 
ITNEILVRFTEALEFNTWKLFALANTRADDHVAFAIAIMLKAN 
KTITSLNLDSNHITGKGILAIFRALLQNNTLTELRFHNQRHICG 
GKTEMEIAKLLKENTTLLKLGYHFELAGPRMTVTNLLSRNMDKQ 
RQKRLQEQRQAQEAKGEKKDLLEVPKAGAVAKGSPKPSPQPSPK 
PSPKWSPKKGGAPAAPPPPPPPLAPPLIMENLKNSLSPATQRKM 
GDKVLPAQEKNSRDQLLAAIRSSNLKQLKKVEVPKLLQ 


5560 


9 


921 


SSWEFSALSVSMACLSPSQLQKFQQDGFLVLEGFLSAEECVAM 
QQRIGEIVAEMDVPLHCRTEFSTQEEEQLRAQGSTDYFLSSGDK 
IRFFFEKGVFDEKGNFLVPPEKSINKIGHALHAHDPVFKSITHS 
FKVQTLARSLGLQMPVWQSMYIFKQPHFGGEVSPHQDASFLYT 
EPLGRVLGVWIAVEDATLENGCLWFIPGSHTSGVSRRMVRAPVG 
SAPGTSFLGSEPARDNSLFVPTPVQRGALVLIHGEWHKSKQNL 
SDRSRQAYTFHLMEASGTTWSPENWLQPTAELPFPQLYT 


5561 


2175 


1775 


CYFIFQFFSSPYPGLHPHQTPAPLPNPGLYPPPVSMSPGQPPPQ 
QLLAPTYFSAPGVMNFGNPSYPYAPGALPPPPPPHLYPNTQAPS 
QVYGGVTYYNPAQQQVQPKPSPPRRTPQPVTIKPPPPEWSRGS 

s 


5562 


342 


1385 


SSGKNDMAAAGAAGLVRGLKAGVLSQADYLNLVQCETLEDLKLH 
LQSTDYGNFLANEASPLTVSVIDDRLKEKMWEFRHMRNHAYEP 
LASFLDFITYSYMIDNVILLITGTLHQRSIAELVPKCHPLGSFE 
QMEAVNIAQTPAELYNAILVDTPLAAFFQDCISEQDLDEMNIEI 
IRNTLYKAYLESFYKFCTLLGGTTADAMCPILEFEADRRAFIIT 
INSFGTELSKEDRAKLFPHCGRLYPEGLAQLARADDYEQVKNVA 
DYYPEYKLLFEGAGSNPGDKTLEDRFFEHEVKLNKLAFLNQFHF 
GVFYAFVKLKEQECRNIVWIAECIAQRHRAKIDNYIPIF 


5563 


342 


1385 


SSGKNDMAAAGAAGLVRGLKAGVLSQADYLNLVQCETLEDLKLH 
LQSTDYGNFLANEASPLTVSVIDDRLKEKMWEFRHMRNHAYEP 
LASFLDFITYSYMIDNVILLITGTLHQRSIAELVPKCHPLGSFE 
QMEAVNIAQTPAELYNAILVDTPLAAFFQDCISEQDLDEMNIEI 
IRNTLYKAYLESFYKFCTLLGGTTADAMCPILEFEADRRAFIIT 
INSFGTELSKEDRAKLFPHCGRLYPEGLAQLARADDYEQVKNVA 
DYYPEYKLLFEGAGSNPGDKTLEDRFFEHEVKLNKLAFLNQFHF 
GVFYAFVKLKEQECRNIVWIAECIAQRHRAKIDNYIPIF 


5564 


3 


914 


RVRRDKRAVWTARGRRRCGDSMSGGWMAQVGAWRTGALGLALLL 
LLGLGLGLEAAASPLSTPTSAQAAGPSSGSCPPTKFQCRTSGLC 
VPLTWRCDRDLDCSDGSDEEECRIBPCTQKGQCPPPPGLPCPCT 
GVSDCSGOTDKKLRNCSRLACLAGELRCTLSDDCIPLTWRCDGH 
PDCPDSSDELGCGTNEILPEGDATTMGPPVTLE^VT^T.PNJlTTM 
GPPVTLESVPSVGNATSSSAGDQSGSPTAYGVIAAAAVLSASLV 
TATLLLLSWLRAQERLRPLGLLVAMKESLLLSEQKTSLP 


5565 


993 


138 


RWNSPNPARAGSISRPQRAPGSVSAVAMTAAVFFGCAFIAFGPA 
LALYVFTIATBPLRIIFLIAGAFFWLVSLLISSLVWFMARVIID 
NKDGPTQKYLLIFGAFVSVYIQEMFRFAYYKLLKKASEGLKSIN 
PGETAPSMRLLAYVSGLGFGIMSGVFSFVNTLSDSLGPGTVGIH 
GDSPQFFLYSAFMTLVIILLHVFWGIVFFDGCEKKKWGILLIVL 
LTHLLVSAQTFISSYYGINLASAFIILVLMGTWAFLAAGGSCRS 
LKLCLLCQDKNFLLYNQRSR 


5566 


2043 


1232 


SHIQHHGRGAQAPVKMVSWMISRAWLVFGMLYPAYYSYKAVKT 

KNVKEYVRWMMYWIVFALYTVIETVADQTVAWFPLYYELKIAFV 

IWLLSPYTKGASLIYRKFLHPLLSSKEREIDDYIVQAKERGYET 

MVNFGRQGLNLAATAAVTAAVKSQGAITERLRSFSMHDLTTIQG 

DEPVGQRPYQPLPEAKKKSKPAPSESAGYGIPLKDGDEKTDEEA 

EGPYSDNEMLTHKGPRRSQSMKSVKTTKGRKEVRYGSLKYKVKK 

RPQVYF 


5567 


1554 


233 


EFLGSGVSPDLANEDGLTALHOnCTDDPRPMl/nnT t PariaMTKnv " 
CDSECWTPLHAAATCGHLHLVELLIASGANLLAVNTDGNMPYDL 
CDDEQTLDCLETAMADRGITQDSIEAARAVPELRMLDDIRSRLQ 
AGADLHAPLDHGATLLHVAAANGFSEAAALLLEHRASLSAKDQD 
GWEPLHAAAYWGQVPLVELLVAHGADLNAKSLMDETPLDVCGDE 
EVRAKLLELKHKHDALLRAQSRQRSLLRRRTSSAGSRGKWRRV 
SLTQRTDLYRKQHAQEAIVWQQPPPTSPEPPEDNDDRQTGAELR 
PPPPEEDNPEWRPHNGRVGGSPVRHLYSKRLDRSVSYQLSPLD 
STTPHTLVHDKAHHTLADLKRQRAAAKLQRPPPEGPESPETAEP 
GLPGDTVTPOPDCGFRAGRnPPT.T.TfT,TAPaVPRD\/RPODr , r , T t m 


5568 


1731 


587 


AEDRQPASRRGAGTTAAMAASGPGCRSWCLCPSVPSATFFTALL 
SLLVSGPRLFLLQQPLAPSGLTLKSEALRNWQVYRLVTYIFVYE 
NPISLLCGAIIIWRFAGNFERTVGTVRHCFFTVIFAIFSAIIFL 
SFEAVSSLSKLGEVEDARGFTPVAFAMLGVTTVRSRMRRALVFG 
MWPSVLVPWLLLGASWLIPQTSFLSNVCGLSIGLAYGLTYCYS 
IDLSERVALKLDQTFPFSLMRRISVFKYVSGSSAERRAAQSRKL 
NPVPGSYPTQSCHPHLSPSHPVSQTQHASGQKLASWPSCTPGHM 
PTLPPYQPASGLCYVQNHFGPNPTSSSVYPASAGTSLGIQPPTP 
VNSPGTVYSGALGTPGAAGSKESSRVPMP 


5569 


2 


835 


QTPCPLAWERGSRSEDISVPGQKPPTCSSFSGMDVGPSSLPHLG 
LKLLLLLLLLPLRGQANTGCYGIPGMPGLPGAPGKDGYDGLPGP 
KGEPGIPAIPGIRGPKGQKGEPGLPGHPGKNGPKGPPGMPGVPG 
PMGIPGEPGEEGRYKOKFO^VFTVTROTHOPPAPN^T»TRFNM/T. 
TNPQGDYDTSTGKFTCKVPGLYYFVYHASHTANLCVLLYRSGVK 
WTFCGHTSKTNQVNSGGVLLRLQVGBEVWLAVNDYYDMVGIQG 
SDSVFSGFLLFPD 


5570 


264 


946 


RDRRDRGGVATSTEEPARPRAPQSRGPGPVSQTGRGRERGGGDT 
MSSPSPGKRRMDTDWKLIESKHEVTILGGLNEFWKFYGPQGT 

DVINQTWTALYDLTNIFESFLPQLIAYPNPIDPLNGDAAAMYLH 
RPEEYKQKIKEYIQKYATEEALKEQEEGTGDSSSESSMSDFSED 
EAQDMEL 


5571 


264 


946 


RDRRDRGGVATSTEEPARPRAPQSRGPGPVSQTGRGRERGGGDT 
MSSPSPGKRRMDTDWKLIESKHEVTILGGLNEFWKFYGPQGT 

73YPnrtVWTf\7RVT)T«PnVVPPVQDQTf'3T?M'K!K'T P'UDTJTrkPIvCr'TWr'T 
tr i Quu v rMiv, v t\ V ULi JtUIn J. e C IV Cj ro A V7 r rlW i\X r rit W J.L/E>Moo 1 V 

DVINQTWTALYDLTNIFESFLPQLLAYPNPIDPLNGDAAAMYLH 
RPEEYKQKIKEYIQKYATEEALKEQEEGTGDSSSESSMSDFSED 
EAQDMEL 


5572 


2802 


2085 


RTDYRTGIPGRRFRVMAAGDGDVKLGTLGSGSESSNDGGSESPG 
DAGAAAEGGGWAAAALALLTGGGEMLLNVALVALVLLGAYRLWV 
RWGRRGLGAGAGAGSESPATSLPRMKKRDFSLEQLRQYDGSRNP 
RILLAVNGKVFDVTKGSKFYGPAGPYGIFAGRDASRGLATFCLD 
KDALRDEYDDLSDLNAVQMESVREWEMQFKEKYDYVGRLLKPGE 
EPSEYTDEEDTKDHNKQD 


5573 


2562 


219 


VPARTPNAEDQGPEARAATATPCQSGGRERAGEAAEDGVKMAAF 
SEMGVMPEIAQAVEEMDWLLPTDIQAESIPLILGGGDVLMAAET 
GSGKTGAFSIPVIQIVYETLKDQQEGKKGKTTIKTGASVLNKWQ 
MNPYDRGSAFAIGSDGLCCQSREVKEWHGCRATKGLMKGKHYYE 
VSCHDQGLCRVGWSTMQASLDLGTDKFGFGFGGTGKKSHNKQFD 
NYGEEFTMHDTIGCYLDIDKGHVKFSKNGKDLGLAFEIPPHMKN 
QALFPACVLKNAELKFNFGEEEFKFPPKDGFVALSKAPDGYIVK 
SQHSGNAQVTQTKFLPNAPKALIVEPSRELAEQTLNNIKQFKXY 
1 UN i 3 JUjKE uu 1 1 GG VAAKDQ LS VLENG VD I WGT P GRLDD L VS T 
GKLNLSQVRFLVLDEADGLLSQGYSDFINRMHNQIPQVTSDGKR 
LQVIVCSATLHSFDVKKLSEKIMHFPTWVDLKGEDSVPDTVHHV 
VVPWPKTDRLWERLGKSHIRTDDVHAKDNTRPGANSPEMWSEA 
IKILKGEYAVRAIKEHKMDQAIIFCRTKIDCDNLEQYFIQQGGG 
PDKKGHQFSCVCLHGDRKPHERKQNLERFKKGDVRFLICTDVAA 

T?r2TnT Un\rDV\7Tl\T\/TT.DnPVri\TV\7UD Tr'D\fr , T37iT?'Dry>r*T TVTCT 1T7\ 

Kuiuinuvrl v J.J.M v 1 LifUcj JxyrJ x VrlKaoKVoKACiKl'lol-iAlo J_»vA 
TEKEKVWYHVCSSRGKGCYNTRLKEDGGCTIWYNEMQLLSEIEE 
HI^CTISQ\^PDIKVPVDEFDGKVTYGQKRAAGGGSYKGHVI>IL 
APTVQELAALEKEAQTSFLHLGYLPNQLFRTF 


5574 


1731 


952 


NEGLEVFKEQELQPEDKGAVPEDASTERSAMASLGLQLVGYILG 
LLGLLGTLVAMLLPSWKTSSYVGASIVTAVGFSKGLWMECATHS 
TGITQCDIYSTLLGLPADIOAAQAMMVTSSAISSLACIISWGM 
RCTVFCQESRAKDRVAVAGGVFFILGGLLGFIPVAWNLHGILRD 
FYSPLVPDSMKFEIGEALYLGIISSLFSLIAGIILCFSCSCQRN 
RSNYYDAYQAQPLATRSSPRPGQPPKVKSEFNSYSLTGYV 


5575 


456 


766 


LLWALPCPPPTAAAVLL3STGLMELLEKMLALTLAKADSPRTAL 
LCSAWLLTASFSAQQHKGSLQKDPLLSQACVGCLEALLDYLDAR 

PDTC1PN9 PHYT.MFP 


5576 


249 


2146 


RSWGAPWFWPJV1RLLRRPJ1MPLRLAMVGCAFVLFLFLLHRDVSSR 
EEATEKPWLKSLVSRKDHVLDL^F*AMNNLRDSMPKLQIRAPEA 
QQTLFSINQSCLPGFYTPAELKPFWERPPQDPNAPGADGKAFQK 
SKWTPLETQEKEEGYKKHCFNAFASDRISLQRSLGPDTRPPECV 
DQKFRRCPPLATTSVIIVFHNEAWSTLLRTVYSVLHTTPAILLK 
EIILVDDASTEEHLKEKLEQYVKQLQWRWRQEERKGLITARL 
LGASVAQAEVLTFLDAHCECFHGWLEPLLARIAEDKTVWSPDI 
VTIDLNTFEFAKPVQRGRVHSRGNFDWSLTFGWBTLPPHEKQRR 
KDETYPIKSPTFAGGLFSISKSYFEHIGTYDNQMEIWGGENVEM 
SFRVWQCGGQLEIIPCSWGHVFRTKSPHTFPKGTSVIARNQVR 
LAEVWMDSYKKIFYRRNLQAAKMAQEKSFGDISERLQLREQLHC 
HNFSWYLHNVYPEMFVPDLTPTFYGAIKNLGTNQCLDVGENNRG 
GKPLIMYSCHGLGGNQYFEYTTQRDLRHNIAKQLCLHVSKGALG 
LGSCHFTGKNSQVPKDEEWELAQDQLIRNSGSGTCLTSQDKKPA 
MAPCNPSDPHQLWLFV 


5577 


3 


1275 


RNSDCSCGEISVHCLPWVLFILDLKVES^MFCPtKLILLPVLLD " 

YSLGLNDLNVSPPELTVHVGDSALMGCVFQSTEDKCIFKIDWTL 

SPGEHAKDEYVLYYYSNLSVPIGRFQNRVHLMGDILCNDGSLLL 

QDVQEADQGTYICEIRLKGESQVFKKAWLHVLPEEPKELMVHV 

GGLIQMGCVFQSTEVKHVTKVEWIFSGRRAKEEIVFRYYHKLRM 

SVEYSQSWGHFQNRVNLVGDIFRNDGSIMLQGVRESDGGNYTCS 

IHLGNLVFKKTIVLHVSPEEPRTLVTPAALRPLVLGGNQLVIIV 

GIVCATILLLPVLILIVKKTCGNKSSVNSTVLVKNTKKTNPEIK 

EKPCHFERCEGEKHIYSPIIVREVIEEEEPSEKSEATYMTMHPV 

WPSLRSDRNNSLEKKSGGGMPKTQQAF 


5578 


3 


783 


AVESMASPGAGRAPPELPERNCGYREVEYWDQRYQGAADSAPYD 
WFGDFSSFRALLEPELRPEDRILVLGCGNSALSYELFLGGFPNV 
TSVDYSSVWAAMQARYAHVPQLRWETMDVRKLDFPSASFDWL 
EKGTLDALLAGERDPWTVSSEGVHTVDQVLSEVSRVLVPGGRFI 
SMTSAAPHFRTRHYAQAYYGWSLRHATYGSGFHFHLYLMHKGGK 
LSVAQLALGAQILSPPRPPTSPCFLQDSDHEDFLSAIQL 


5579 


3 


1540 


RNSGLARGASALARHGGGLAGGVGWDCGACASRCQGVMEGLLTR 
CRALPALATCSRQLSGYVPCRFHHCAPRRGRRLLLSRVFQPQNL 
REDRVLSLODKSDDLTCKSORIiMLOVC;T_iTYPZk«;pnpVMT t dvtu 

RAMEKLVRVIDQEMQAIGGQKVNMPSLSPAELWQATNRWDLMGK 
ELLRLRDRHGKEYCLGPTHEEAITALIASQKKLSYKQLPFLLYQ 
VTRKFRDEPRPRFGLLRGREFYMKDMYTFDSSPEAAQQTYSLVC 
DAYCSLFNKLGLPFVKVQADVGTIGGTVSHEFQLPVDIGEDRLA 
ICPRCSFSANMETLDLSQMNCPACQGPLTKTKGIEVGHTFYLGT 
KYSSIFNAQFTNVCGKPTLAEMGCYGLGVTRILAAAIEVLSTED 
CVRWPSLLAPYQACLIPPKKGSKEQAASELIGQLYDHITEAVPQ 
LHGEVLLDDRTHLTIGNRLKDANKFGYPFVIIAGKRALEDPAHF 
EVWCQNTGEVAFLTKDGVMDLLTPVQTV 


5580 


1681 


450 


ADAGTRCIPGFWPSGAGYSAPAQRGRRSSGRMRAAAAPGLTAP 
WRLLQCCELEAGELGMAVPAAAMGPSALGQSGPGSMAPWCSVSS 
GPSRYVLGMQELFRGHSKTREFLAHSAKVHSVAWSCDGRRLASG 
SFDKTASVFLLEKDRLVKENNYRGHGDSVDQLCWHPSNPDLFVT 
ASGDKTIRIWDVRTTKCIATVNTKGENINICWSPDGQTIAVGNK 
DDWTFIDAKTHRSKAEEQFKFEVNEISWNNDNNMFFLTNGNGC 
INILSYPELKPVQSINAHPSNCICIKFDPMGKYFATGSADALVS 
LWDVDELVCVRCFSRLDWPVRTLSFSHDGKMLASASEDHFIDIA 
EVETGDKLWEVQCESPTFTVAWHPKRPLLAFACDDKDGKYDSSR 
EAGTVKLFGLPNDS 


5581 


54 


947 


GGGSGPRAPSATLLDTGESVAAVASGEDKGIAASAAAAAVFACS 
CSPDPQSSTMNPVYSPVQPGAPYGNPKNMAYTGYPTAYPAAAPA 
YNPSLYPTNSPSYAPEFQFLHSAYATLLMKQAWPQNSSSCGTEG 
TFHLPVDTGTENRTYQASSAAFRYTAGTPYKVPPTQSNTAPPPY 
SPSPNPYQTAMYPIRSAYPQQNLYAQGAYYTQPVYAAQPHVIHH 
TTWQPNSIPSAIYPAPVAAPRTNGVAMGMVAGTTMAMSAGTLL 
TTPQHTAIGAHPVSMPTYRAQGTPAYSYVPPHW 


5582 


5775 


2739 


IITNNNNVIIPLVIAYHLSGSAQARGERSPAERLMERQKRKADI 
EKGLQFIQSTLPLKQEEYEAFLLKLVQNLFAEGNDLFREKDYKQ 
ALVQYMEGIWADYAASDQVALPRELLCKLHVNRAACYFTMGLY 
EKALEDSEKALGLDSESIRALFRKARALNELGRHKEAYECSSRC 
SLALPHDESVTQLGQELAQKLGLRVRKAYKRPQELETFSLLSNG 
TAAGVADQGTSNGLGSIDDIETDCYVDPRGSPALLPSTPTMPLF 
PHVLDLLAPLDSSRTLPSTDSLDDFSDGDVFGPELDTLLDSLSL 
VQGGLSGSGVPSELPQIilPVFPGGTPLLPPWGGSIPVSSPLPP 
ASFGLVMDPSKKLAASVLDALDPPGPTLDPLDLLPYSETRLDAL 
DSFGSTRGSLDKPDSFMEETNSQDHRPPSGAQKPAPSPEPCMPN 
T AL L I KN P IoAATH B F KQACQLC Y P KTG PRAGD YT Y REG LEHKCK 
RDILLGRLRSSEDQTWKRIRPRPTKTSFVGSYYLCKDMINKQDC 
KYGDNCTFAYHQEEIDVWTEERKGTLNRDLLFDPLGGVKRGSLT 
IAKLLKEHQGIFTFLCEICFDSKPRIISKGTKDSPSVCSNLAAK 
HSFYNNKCLVHIVRSTSLKYSKIRQFQEHFQFDVCRHEVRYGCL 
REDSCHFAHSFIELKVWLLQQYSGMTHEDIVQESKKYWQQMEAH 
AGKASSSMGAPRTHGPSTFDLQMKFVCGQCWRNGQVVEPDKDLK 
YCSAKARHCWTKERRVLLVMSKAKRKWVSVRPLPSIRNFPQQYD 
LCIHAQNGRKCQYVGNCSFAHSPEERDMWTFMKENKILDMQQTY 
DMV3LKKHNPGKPGEGTPISSREGEKQIQMPTDYADIMMGYHCWL 
CGKNSNSKKQWQQHIQSEKHKEKVFTSDSDASGWAFRFPMGEFR 
LCDRLQKGKACPDGDKCRCAHGQEELNEWLDRREVLKQKLAKAR 
KDMLLCPRDDDFGKYNFLLQEDGDLAGATPEAPAAAATATTGE 


5583 


3 


1265 


SSGCRQGRPGRSDRPRPPPRRHKMVKETRYYDILGVKPSASPEE 
IKKAYRKLALKYHPDKNPDEGEKFKLISQAYEVLSDPKKRDVYD 
QGGEQAIKEGGSGSPSFSSPMDIFDMFFGGGGRMARERRGKNW 
HQLSVTLEDLYNGVTKKLALQKNVICEKCEGVGGKKGSVEKCPL 
CKGRGMHIHIQQIGPGMVQQIQTVCIECKGQGERINPKDRCESC 
SGAKVIREKKIIEVHVEKGMKDGQKILFHGEGDQEPELEPGDVI 
IVLDQKDHSVFQRRGHDLIMKMKIQLSEALCGFKKTIKTLDNRI 
LVITSKAGEVIKHGDLRCVRDEGMPIYKAPLEKGILIIQFLVIF 
PEKHWLSLEKLPQLEALLPPRQKVRITDDMDQVELKEFCPNEQN 
WRQHREAYEEDEDGPQAGVQCQTA 


5584 


3 


1265 


SSGCRQGRPGRSDRPRPPPRRHKMVKETRYYDILGVKPSASPEE 
IKKAYRKLALKYHPDKNPDEGEKFKLISQAYEVLSDPKKRDVYD 
QGGEQAIKEGGSGSPSFSSPMDIFDMFFGGGGRMARERRGKNW 
HQLSVTLEDLYNGVTKKLALQKNVICEKCEGVGGKKGSVEKCPL 
CKGRGMHIHIQQIGPGMVQQIQTVCIECKGQGERINPKDRCESC 
SGAKVIREKKIIEVHVEKGMKDGQKILFHGEGDQEPELEPGDVI 
IVLDQKDHSVFQRRGHDLIMKMKIQLSEALCGFKKTIKTLDNRI 
LVITSKAGEVIKHGDLRCVRDEGMPIYKAPLEKGILIIQFLVIF 
PEKHWLSLEKLPQLEALLPPRQKVRITDDMDQVELKEFCPNEQN 
WRQHREAYEEDEDGPQAGVQCQTA 


5585 


2619 


915 


LPAGTPESSLHEALDQCMTALDLFLTNQFSEALSYLKPRTKESM 
YHSLTYATILEMQAMMTFDPQDILLAGNMMKEAQMLCQRHRRKS 
SVTDSFSSLVNRPTLGQFTEEEIHAEVCYAKCLLQRAALTFLQD 
ENMVSFIKGGIKVRNSYQTYKELDSLVQSSQYCKGENHPHFEGG 
VKLGVGAFNLTLSMLPTRILRLLEFVGFSGNKDYGLLQLEEGAS 
GHSFRSVLCVMLLLCYHTFLTFVLGTGNVNIEEAEKLLKPYLNR 
YPKGAIFLFLAGRIEVIKGNIDAAIRRFEECCEAQQHWKQFHHM 
CYWELMWCFTYKGQWKMSYFYADLLSKENCWSKATYIYMKAAYL 
SMFGKEDHKPFGDDEVELFRAVPGLKLKIAGKSLPTEKFAIRKS 
RRYFSSNPISLPVPALEMMYIWNGYAVIGKQPKLTDGILEIITK 
AEEMLEKGPENEYSVDDECLVKLLKGLCLKYLGRVQEAEENFRS 
ISANEKKIKYDHYLIPNALLELALLLMEQDRNEEAIKLLESAKQ 
NYKNYSMESRTHFRIQAATLQAKSSLENSSRSMVSSVSL 


5586 


2619 


915 


LPAGTPESSLHEALDQCMTALDLFLTNQFSEALSYLKPRTKESM 
YHSLTYATILEMQAMMTFDPQDILLAGNMMKEAQMLCQRHRRKS 
SVTDSFSSLVNRPTLGQFTEEEIHAEVCYAKCLLQRAALTFLQD 
ENMVSFIKGGIKVRNSYQTYKELDSLVQSSQYCKGENHPHFEGG 

VKLGVGAFNLTLSMLPTRILRLLEFVGFSGNKDYGLLQLEEGAS 
GHSFRSVLCVMLLLCYHTFLTFVLGTGNVNIEEAEKLLKPYLNR 
YPKGAIFLFLAGRIEVIKGNIDAAIRRFEECCEAQQHWKQFHHM 
CYWELMWCFTYKGQWKMSYFYADLLSKENCWSKATYIYMKAAYL 
SMFGKEDHKPFGDDEVELFRAVPGLKLKIAGKSLPTEKFAIRKS 
RRYFSSNPISLPVPALEMMYIWNGYAVIGKQPKLTDGILEIITK 
AEEMLEKGPENEYSVDDECLVKLLKGLCLKYLGRVQEAEENFRS 
ISANEKKIKYDHYLIPNALLELALLLMEQDRNEEAIKLLESAKQ 
NYKNYSMESRTHFRIQAATLQAKSSLENSSRSMVSSVSL 


5587 


1768 


148 


SSAVPDGAVC5RPVAVAVGGPPHSCRCRPCCLMAAIGVHLGCTSA 
CVAVYKDGRAGWANDAGDRVTPAWAYSENEEIVGLAAKQSRI 
RNISNTVMKVKQILGRSSSDPQAQKYIAESKCLVIEKNGKLRYE 
IDTGEETKFVNPEDVARLIFSKMKETAHSVLGSDANDWITVPF 
DFGEKQKNALGEAARAAGFNVLRLIHEPSAALLAYGIGQDSPTG 
KSNILVFKLGGTSLSLSVMEVNSGIYRVLSTNTDDNIGGAHFTE 
TLAQYLASEFQRSFKHDVRGNARAMMKLTNSAEVAKHSLSTLGS 
ANCFLDSLYEGQDFDCNVSRARFELLCSPLFNKCIEAIRGIjLDQ 
NGFTADDINKWLCGGSSRIPKLQQLIKDLFPAVELLNSIPPDE 
VIPIGAAIEAGILIGKENLLVEDSLMIECSARDILVKGVDESGA 
SRFTVLFPSGTPLPARRQHTLQAPGSISSVCLELYESDGKNSAK 
EETKFAQWLQDLDKKENGLRDILAVLTMKRDGSLHVTCTDQET 
GKCEAISIEIAS 


5588 


3 


589 


TPPPPEQAMVAATVAAAWLLLWAAACAQQEQDFYDFKAVNIRGK 
LVSLEKYRGSVSLWNVASECGFTDQHYRALQQLQRDLGPHHFN 
VLAFPCNQFGQQEPDSNKEIESFARRTYSVSFPMFSKIAVTGTG 
AHPAFKYLAQTSGKEPTWNFWKYLVAPDGKWGAWDPTVSVEEV 
RPQITALVRKLILLKREDL 


5589 


1884 


553 


LRQAWHEGGIGQTDKERGAAALPGEEGDPTRGRSLGRASWESGS 
PRRPRSPFSSFLPRPICLSLEARPCSIEDRRNWSLIGRPGAPAS 
GLNRSSGLWLGPDRCRPRSRCSCRVMENPSPAAALGKALCALLL 
ATLGAAGQPLGGESICSARAPAKYSITFTGKWSQTAFPKQYPLF 
RPPAQWSSLLGAAHSSDYSMWRKNQYVSNGLRDFAERGEAWALM 
KEIEAAGEALQSVHAVFSAPAVPSGTGQTSAELEVQRRHSLVSF 
WRIVPSPDWFVGVDSLDLCDGDRWREQAALDLYPYDAGTDSGF 
TFSSPNFATIPQDTVTEITSSSPSHPANSFYYPRLKALPPIARV 
TLLRLRQSPRAF1PPAPVLPSRDNEIVDSASVPETPLDCEVSLW 
SSWGLCGGHCGRLGTKSRTRYVRVQPANNGSPCPELEEEAECVP 
DNCV 


5590 


72 


896 


LCSSGALRLLPAMVAWRSAFLVCLAFSLATLVQRGSGDFDDFNL 
EDAVKETSSVKQPWDHTTTTTTORPGTTRAPAKPPGSGLDLADA 
LDDQDDGRRKPGIGGRERWNHVTTTTKRPVTTRAPANTLGNDFD 
LADALDDRNDRDDGRRKPIAGGGGFSDKDLEDIVGGGEYKPDKG 
KGDGRYGSNDDPGSGMVAEPGTIAGVASALAMALIGAVSSYISY 
QQKKFCFSIQQGLNADYVKGENLEAWCEEPQVKYSTLHTQSAE 
PPPPPEPARI 


5591 


68 


1494 


AGSSRRAAAERLLVSAGCRSLAGRASGVLLLPAELLPGEEEAMA 
LRVTRNSKINAENKAKINMAGAKRVPTAPAATSKPGLRPRTALG 
DIGNKVSEQLQAKMPMKKEAKPSATGKVIDKKLPKPLEKVPMLV 
PVPVSEPVPEPEPEPEPEPVKEEKLSPEPILVDTASPSPMETSG 
CAPABEDLCQAFSDVILAVNDVDAEDGADPNLCSEYVKDIYAYL 
RQLEEEQAVRPKYLLGREVTGNMRAILIDWLVQVQMKFRLLQET 
MYMTVSIIDRFMQNNCVPKKMLQLVGVTAMFIASKYEEMYPPEI 
GDFAFVTDNTYTKHQIRQMEMKILRALNFGLGRPLPLHFLRRAS 
KIGEVDVEQHTLAKYLMELTMLDYDMVHFPPSQIAAGAFCLALK 
ILDNGEWTPTLQHYLSYTEESLLPVMQHLAKNAAMVNQGLTKHM 
TVKNKYATSKHAKISTLPQLNSALVQDLAKAVAKV 


5592 


242 


924 


YGESKDWNQKDLLSALVLTTVNCLPTPIWAKSAEVKLAIFGRAG 
VGKSALWRFLTKRFIWEYDPTLESTYRHQATIDDEWSMEILD 
TAGQEDTIQREGHMRWGEGFVLVYDITDRGSFEEVLPLKNILDE 
IKKPKNVTLILVGNKADLDHSRQVSTEEGEKLATELACAFYECS 
ACTGEGNITEIFYELCREVRRRRMVQGKTRRRSSTTHVKQAINK 
MLTKISS 


5593 


3 


1113 


HASGGRAANMAAERGAGQQQSQEMMEVDRRVESEESGDEEGKKH 
SSGIVADLSEQSLKDGEERGEEDPEEEHELPVDMETINLDRDAE 
DVDLNHYRIGKIEGFEVLKKVKTLCLRQNLIKCIENLEELQSLR 
ELDLYDNQIKKIENLEALTELEILDISFNLLRNIEGVDKLTRLK 
KLFLVNNKISKIENLSNLHQLQMLELGSNRIRAIENIDTLTNLE 
SLFLGKNKITKLQNLDALTNLTVLSMQSNRLTKIEGLQNLVNLR 
ELYLSHNGIEVIEGLENNNKLTMLDIASNRIKKIENISHLTELQ 
EFWMNDNLLESWSDLDELKGARSLETVYLERNPLQKDPQYRRKV 
MLALPSVRQIDATFVRF 


5594 


3 


1113 


HASGGRAANMAAERGAGQQQSQEMMEVDRRVESEESGDEEGKKH 
SSGIVADLSEQSLKDGEERGEEDPEEEHELPVDMETINLDRDAE 
DVDLNHYRIGKIEGFEVLKKVKTLCLRQNLIKCIENLEELQSLR 
ELDLYDNQIKKIENLEALTELEILDISFNLLRNIEGVDKLTRLK 
KLFLVNNKISKIENLSNLHQLQMLELGSNRIRAIENIDTLTNLE 
SLFLGKNKITKLQNLDALTNLTVLSMQSNRLTKIEGLQNLVNLR 
ELYLSHNGIEVIEGLENNNKLTMLDIASNRIKKIENISHLTELQ 
EFWMNDNLLESWSDLDELKGARSLETVYLERNPLQKDPQYRRKV 
MLALPSVRQIDATFVRF 


5595 


3 


1476 


ARWNGRWVQVPAWPGPGCGTNASGEI^QRQLPRAWRPVGRTLGSE 
PIALAWSPPLYLFPIPLPSWAVSQPTPTLGTMFADLDYDIEEDK 
LGIPTVPGKVTLQKDAQNLIGISIGGGAQYCPCLYIVQVFDNTP 
AALDGTVAAGDEITGVNGRSIKGKTKVEVAKMIQEVKGEVTIHY 
NKLQADPKQGMSLDIVLKKVKHRLVENMSSGTADALGLSRAILC 
NDGLVKRLEELERTAELYKGMTEHTKNLLRAFYELSQTHRAFGD 
VFSVIGVREPQPAASEAFVKFADAHRSIEKFGIRLLKTIKPMLT 
DLNTYLNKAIPDTRLTIKKYLDVKFEYLSYCLKVKEMDDEEYSC 
IALGEPLYRVSTGNYEYRLXLRCRQEARARFSQMRKDVLEKMEL 
LDQKHVQDIVFQLQRLVSTMSKYYNDCYAVLRDADVFPIEVDLA 
HTTLAYGLNQEEFTDGEEEEEEEDTAAGEPSRDTRGAAGPLDKG 
GSWCDS 


5596 


698 


219 


GAVLAPS SLPAAELAAQGE S QSLEDLSNTSR PTSE V YK I S F I FP 
NGDKYDGDCTRlTSSGIYERNGIGIHTTPNGIVYTGSWKDDKMNG 
FGRLEHFSGAVYEGQFKDNMFHGLGTYTFPNGAKYTGNFNENRV 
KGEGEYTHIQGTRMDWTFHFTSCSQT 


5597 


3 


731 


I S CKMAADGQS SLPASWRS VTLTHVE YPAGDLSGHLLAYLSLS P 
VFVIVGFVTLIIFKRELHTISFLGGLALNEGVNWLIKNVIQEPR 
PCGGPHTAVGTKYGMPSSHSQFMWFFSVYSFLFLYLRMHQTNNA 
RFLDLLWRHVLSLGLLAVAFLVSYSRVYLLYHTWSQVLYGGIAG 
GLMAIAWFIFTQEVLTPLFPRIAAWPVSEFFLIRDTSLIPNVLW 
FEYTVTRAEARNRQRKLGTKLQ 


5598 


326 


2440 


GIGPIAASFIFCKVASLYIFLSPPPPSVSGVPYSPANSSWSCAL 
VPLLGSGVPPHPPAPSPCCSGQTMLKMLSFKLLLLAVALGFFEG 
DAKFGERNEGSGARRRRCLNGNPPKRLKRRDRRMMSQLELLSGG 
EMLCGGFYPRLSCCLRSDSPGLGRLENKIFSVTNNTECGKLLEE 
IKCALCSPHSQSLFHSPEREVLERDLVLPLLCKDYCKEFFYTCR 
GHIPGFLQTTADEFCFYYARKDGGLCFPDFPRKQVRGPASNYLD 
QMEEYDKVEEISRKHKHNCFCIQEWSGLRQPVGALHSGDGSQR
LFILEKEGYVKILTPEGEIFKEPYLDIHKLVQSGIKGGDERGLL
SLAFHPNYKKNGKLYVSYTTNQERWAIGPHDHILRWEYTVSRK
NPHQVDLRTARVFLEVAELHRKHLGGQLLFGPDGFLYIILGDGM
ITLDDMEEMDCLSDFTGSVLRLDVDTDMCNVPYSIPRSNPHFNS
TNQPPEVFAHGLHDPGRCAVDRHPTDININLTILCSDSNGKNRS
SARILQIIKGKDYESEPSLLEFKPFSNGPLVGGFVYRGCQSERL
YGSYVFGDRNGNFLTLQQSPVTKQWQEKPLCLGTSGSCRGYFSG
HILGFGEDELGEVYILSSSKSMTQTHNGKLYKIVDPKRPLMPEE
CRATVQPAQTLTSECSRLCRWGYCTPTGKCCCSPGWEGDFCRTG


5599 


326 


2440 


GIGPIAASFIFCKVASLYIFLSPPPPSVSGVPYSPANSSWSCAL 
VPLLGSGVPPHPPAPSPCCSGQTMLKMLSFKLLLLAVALGFFEG
DAKFGERNEGSGARRRRCLNGNPPKRLKRRDRRMMSQLELLSGG
EMLCGGFYPRLSCCLRSDSPGLGRLENKIFSVTNNTECGKLLEE
IKCALCSPHSQSLFHSPEREVLERDLVLPLLCKDYCKEFFYTCR
GHIPGFLQTTADEFCFYYARKDGGLCFPDFPRKQVRGPASNYLD
QMEEYDKVEEISRKHKHNCFCIQEWSGLRQPVGALHSGDGSQR
LFILEKEGYVKILTPEGEIFKEPYLDIHKLVQSGIKGGDERGLL
SLAFHPNYKKNGKLYVSYTTNQERWAIGPHDHILRWEYTVSRK
NPHQVDLRTARVFLEVAELHRKHLGGQLLFGPDGFLYIILGDGM
ITLDDMEEMDGLSDFTGSVLRLDVDTDMCNVPYSIPRSNPHFNS
TNQPPEVFAHGLHDPGRCAVDRHPTDININLTILCSDSNGKNRS
SARILQIIKGKDYESEPSLLEFKPFSNGPLVGGFVYRGCQSERL 
YGSYVFGDRNGNFLTLQQSPVTKQWQEKPLCLGTSGSCRGYFSG 
HILGFGEDELGEVYILSSSKSMTQTHNGKLYKIVDPKRPLMPEE
CRATVQPAQTLTSECSRLCRNGYCTPTGKCCCSPGWEGDFCRTG 


5600 


1977 


1244 


SLRVLSGHLMQTRDLVQPDKPASPKFIVTLDGVPSPPGYMSDQE 
EDMCFEGMKPVNQTAASNKGLRGLLHPQQLHLLSRQLEDPNGSF 
SNAEMSELSVAQKPEKLLERCKYWPACKNGDECAYHHPISPCKA 
FPNCKFAEKCLFVHPNCKYDAKCTKPDCPFTHVSRRIPVLSPKP 
AVAPPAPPSSSQLCRYFPACKKMECPFYHPKHCRFNTQCTRPDC 
TFYHPTINVPPRHALKWIRPQTSE 


5601 


1977 


1244 


SLRVLSGHLMQTRDLVQPDKPASPKFIVTLDGVPSPPGYMSDQE
EDMCFEGMKPVNQTAASNKGLRGLLHPQQLHLLSRQLEDPNGSF
SNAEMSELSVAQKPEKLLERCKYWPACKNGDECAYHHPISPCKA
FPNCKFAEKCLFVHPNCKYDAKCTKPDCPFTHVSRRIPVLSPKP
AVAPPAPPSSSQLCRYFPACKKMECPFYHPKHCRFNTQCTRPDC
TFYHPTINVPPRHALKWIRPQTSE


5602 


246 


766 


YHTSCTA/WRTAKEALENTEVPVGCLMVYNNEWGKGRNEVNQTK 
NATRHAE M VA I DQ VLDVJ CRQ S G KS P S E VFEHT V L YVTVE P C I MC 
AAALRLMKIPLWYGCQNERFGGCGSVLNIASADLPNTGRPFQC 
I PG YRAE EA VEM L KTFYKQENPNAP KS KVRKKE CQQ I LNMF 


5603


1 


565 


FRGRTPISGGERGCAQYPIPATPARSGENRTMPGAGDGGKAPAR
WLGTGLLGLFLLPVTLSLEVSVGKATDIYAVNGTEILLPCTFSS
CFGFEDLHFRWTYNSSDAFKILIEGTVKNEKSDPKVTLKDDDRI
TLVGSTKEKRNNISIVLRDLEFSDTGKYTCHVKNPKENNLQHHA
TIFLQWDRRMQ


5604 


1 


1506 


EDIFPAQLLKLQRHERVWQQEPPVRDHRSWGGSGAGGVAGREWT
DQGQVALGGHYMAEGEGYFAMSEDEIiACSPYIPLGGDFGGGDFG 
GGDFGGGDFGGGDFGGGGSFGGHCLDYCESPTAwrNVT wwmvn 

RLLX3ILSETIPIHGRGNFPTLELQPSLIVKWRRRLAEKRIGVR 
D VRLNGS AASHVLHQDSGLG YKDLDL I FCADLRGEG E FQTVKDV 
VLDCLLDFLPEGVNKEKITPLTLKEAYVQKMVKVCNDSDRWSLI 
SLSNNSGKNVELKFVDSLRRQFEFSVDSFQIKLDSLLLFYECSE 
NPMTETFHPTI IGES VYGDFQEAFDHLCNKI I ATRNPEEIRGGG 
LLKYCNLLVRGFRPASDEIKTLQRYMCSRFFIDFSDIGEQQRKL 
E S YLQNHFVG LE DRK YE YLMTLHG WNE S T VCLMGHERRQTLNL 
ITMIAIRVLAIX2OTIPNVANVTCYYQPAPYVADANFSNYYIAQV 
QPVFTCQQQTYSTWLPCN 


5605 


35 


1821 


SQRSCPRSPSSPAPPWARCSNPDSRTGGVPVPRAWSAGGPALGL 
MAAPVRLGRKRPLPACPNPLFVRWLTEWRDEATRSRHRTRFVFQ 
KALRSLRRYPLPLRSGKEAKILQHFGDGLCRMLDERLQRHRTSG 
GDHAPDSPSGENSPAPQGRLAEVQDSSMPVPAQPKAGGSGSYWP 
ARHS GAR V I L L VL YRE HLNPNGHH FLTKEE LLQRCAQ KS P R VAP 
G S ARP WPALR S LLHRNLVLRTHQPAR YS LTPEGLELAQ KLAES E 
GLSLLNVGIGPKEPPGEETAVPGAASAELASEAGVQQQPLELRP 
GEYRVLLCVDIGETRGGGHRPELLRELQRLHVTHTVRKLHVGDF 
VWVAQETN P RD P AN PG E LVLDH I VERKRIiDDLCS S 1 1 DGR FREQ 
KFRLKRCGLERRVYLVEEHGSVHNLSLPESTLLQAVTNTQVIDG 
FFVKRTAD I KESAAYLALLTRGLQRLYQGHTLRSRPWGTPGNPE 
SGAMTSPNPLCSLLTFSDFNAGAIKNKAQSVREVFARQLMQVRG 
VS G E KAAAL VDR YS T P AS L LAA YDACAT P KEQETL LS T I KCGRL 
QRNLGPALSRTLSQLYCSYGPLT 


5606 


3 


1099 


GRSRCPGPGARGGTMSPRSCLRSLRLLVFAVFSAAASNWLYLAK 
LSSVGSISEEETCEKLKG.LIQRQVQMCKRNLEVMDSVRRGAQLA 
I E E CQ YQ FRNRR WNCS TLDS L P VFG KWTQGTREAAFVYA I S S A 
GVAFAVTRACSSGELEKCGCDRTVHGVSPQGFQWSGCSDNIAYG 
VAFS QS F VDVRERS KGAS S S RALMNLHNNEAGRKAI LTHMRVE C 
KCHGVSGSCEVKTCWRAVPPFRQVGHALKEKFDGATEVEPRRVG 
SSRALVPRNAQFKPHTDEDLVYLEPSPDFCBQDMRSGVLGTRGR 
TCNKTSKAIDGCELLCCGRGFHTAQVELAERCSCKFHWCCFVKC 
RQCQRLVELHTCR


5607 


521 


141 


PPVCNPAEAMPSPGTVCSLLLLGMLWLDLAMAGSSFLSPEHQRV 
QQRKESKKPPAKLQPRALAGWLRPEDGGQAEGAEDELEVRFNAP 
FDVGIKLSGVQYQQHSQALGKFLQDILWEEAKEAPADK 


5608 


2 


983 


WFQSPLRQADPGPPRHTLFMDFVAGAIGGVCGDAVGYPLDTVKV 
R I QTE PKYTG I WHC VRDT YHRERVWGFYRGLLLPVCTVS LVS S E 
VFGTYRHCLAHICRLRFGNPDAKPTKADITLSGCASGLVRVFLT 
SPTEVAKVRLQTQTQAQKQQRRLSASGPIiAVPPMCPVPPACPEP 
KYRGPLHCLATVAREEGLCGLYKGSSALVLRDGHSFATYFLSYA 
VL C E W LS P AGHS RP D VPG VL VAGG CAG VLAWAVATPMDV I KS RL 
QADGQGQRRYRGLLHCMVTIVREEGPRVLFKGLVLNCCRAFPVN 
MWFVAYEAVLRLARGLLT 


5609 


1628 


304 


AKGVWVLPSPPPRPGRGALVSGSGLRRGRSGTSWRPRRMNHKSK 
KR I REAKRSAR PELKDS LD WTRHN Y YES FS LS PAAVADNVERAD 
ALQLSVEEFVERYERPYKPWLLNAQEGWSAQEKWTLERLKRKY 
RNQ KF KCG E DN DG YS VKM KMKY Y I E YME S TRDD S PL Y I FD S S YG 
EHPKRRKLLEDYKVPKFFTDDLFQYAGEKRRPPYRWFVMGPPRS 
GTGIHIDPLGTSAWNALVQGHKRWCLFPTSTPRELIKVTRDEGG 
NQQDEAITWFNVIYPRTQLPTWPPEFKPLEILQKPGETVFVPGG 
W WHWLNLDTT I AITQN FAS STN F P WWH KTVRGR PKLS R KW Y R 
ILKQEHPEIAVLADSVDLQESTGIASDSSSDSSSSSSSSSSDSD 
SECESGSEGDGTVHRRKKRRTCSMVGNGDTTSQDDCVSKERSSS 
R 


5610 


54 


1196 


LERT P AS ADMAWT KYQL FLAGIiML VTG S INTLS AKWADN FMAEG 
CGGSKEHSFQHPFLQAVGMFLGEFSCLAAFYLLRCRAAGQSDSS 
VDPQQPFNPLLFLPPALCDMTGTSLMYVALNMTSASSFQMLRGA 
VI I FTGLFS VAFLGRRLVLSQWLG I LATIAGL WVGLADLLS KH 
DSQHKLSE VI TGDLLI I MAQ I I VAIQMVLEEKFVYKHNVH PLRA 
VGTEGLFGFVILSLLLVPMYYIPAGSFSGNPRGTLEDALDAFCQ 
VGQQPLIAVALLGNISSIAFFNFAGISVTKELSATTRMVLDSLR 
T WI WALSLALGWE AFHALQ I LGFL I LL IGTAL YNGLHRPLLGR 
LSRGRPLAEESEQERLLGGTRTPINDAS 


5611 


2 


577 


FVLPNRLGIPGSTFRGPGACASSSSLAASAKPGAGGSPALAMSG 
ELSNRFQGGKAFGLLKARQERRLAEINREFLCDQKYSDEENLPE 
KLTAFKEKYMEFDLNNEGE I DLMSLKRMMEKLGVPKTHLEMKKM 
ISEVTGGVSDTISYRDFVNMMLGKRSAVLKLVMMFEGKANESSP 
KPVGPPPERDIASLP 


5612 


1 


721 


ASRDGYMDATIAPHRIPPEMPQYGEENHIFELMQAMWLCKHLNS 
SLLTLENLILNEFSYTATEARRLYLQRKTVPSALLVQLIQERLA 
EEDCIKQGWILDGIPETREQALRIQTLGITPRHVIVLSAPDTVL 
IERNLGKRIDPQTGEIYHTTFDWPPESEIQNRLMVPEDISELET 
AQKLLEYHRNIVRVIPSYPKILKVISADQPCVDVFYQALTYVQS 
NHRTNAP FTPRVLLLGPVGS 


5613 


115 


1279 


RGVDPALRRAEKMLPLSIKDDEYKPPKFNLFGKISGWFRSILSD 
KTSRNLFFFLCIiNLSFAFVELLYGIWSNCLGLISDSFHMFFDST 
AI LAGLAAS VIS KWRDNDAFS YGYVRAEVLAGFVNGLFLIFTAF 
FI FSEGVERALAPPDVHHERLLLVSILGFWNLIG I FVFKHGGH 
GHSHGSGHGHSHSLFNGALDQAHGHVDHCHSHEVKHGAAHSHDH 
AHGHGHFHSHDGPSLKETTGPSRQILQGVFLHILADTLGSIGVI 
ASAIMMQNFGLMIADPICSILIAILIWSVIPLLRESVGILMQR 
TP P LLENS LPQ C YQ R VQQLQG V YS LQEQHF WTLCS D VYVGTL KL 
I VAPDADARW ILSQTHNI FTQAG VRQLYVQ I DFAAM 


5614 


3 


1268 


LLSRNEHACPLQAGLGLTQRKPKAIRGREGRATNQGQGETQNER 
APWGARQRLG VMAELQQLQE FE I PTGREALRGNHSALLRVAD YC 
EDNYVQATDKRKALEETMAFTTQALASVAYQVGNLAGHTLRMLD 
LQGAALRQ VEARVSTLGQMVNMHMEKVARRE IGTLATVQRLPPG 
QKVIAPENLPPLTPYCRRPLNFGCLDDIGHGIKDLSTQLSRTGT 
LS R KS I KAP ATPAS ATLiG RP P R I P E P VHL P WP DGRLS AAS S AS 
S LAS AG S AEGVGGAPTP KGQAAP PAP PLPSSLDPPPP PAAVE V F 
QRPPTLEELSPPPPDEELPLPLDLPPPPPLDGDELGLPPPPPGF 
GPDEPSWVPASYLEKWTLYPYTSQKDNELSFSEGTVICVTRRY
SDGWCEGVSSEGTGFFPGNYVEPSC 


5615


9 


1558 


ALGRRRPGDPREMEAAATPAAAGAARREELDMDVMRPLINEQNF 
DGTSDEEHEQELLPVQKHYQLDDQEGISFVQTLMHLLKGNIGTG 

T .T .f~2 X ,DT, Zl T VATS/" 1 Tin PDTOT VDT/^ T T C t mouiTTrT t m nnt^nv - 

Jj JjVj Li h'i-LH J. JUMAl? ± V lAa F J. bl» V C X G 1 1 S VHCMH I LVRCSHFLCLR 
FKKSTLGYSDTVSFAMEVSPWSCLQKQAAWGRSWDFFIiVITQL 
GFCSVYIVFLAENVKQVHEGFLESKVFISNSTNSSNPCERRSVD 
LRIYMLCFLPFIILLVFIRELKNLFVLSFLANVSMAVSLVIIYQ 
YWRNMPDPHNLPIVAGWKKYPLFFGTAVFAFEGIGVVLPLENQ 
MKE S KR F P QALN IGMG I VTTL YVTLATLG YM CFHDE I KG S I TLN 

jjryuvnu iyoVMbiat'bJ.r V J. I£j±yr x V PAi* 1 II PGI TSKFHT 
KWKQ I CE FG I RS FL VS I TCAGA I L I PRLD I V I S FVG AVS S STLA 
LILPPLVEILTFSKEHYNIWMVLKNISIAFTGWGFLLGTYITV 
EEIIYPTPKWAGTPQSPFLNLNSTCLTSGLK 


5616 


1 


719 


DDFVRCGPQSAAMGASARLLRAVIMGAPGSGKGTVSSRITTHFE 
L KHL SSGDLLRDNMLRGTE IGVLAKAF I DQGKL I PDDVMTRLAL 
HELKNLTQYSWLLDGFPRTLPQAEALDRAYQIDTVINLNVPFEV 
IKQRLTARWIHPASGRVYNIEFNPPKTVGIDDLTGEPLIQREDD 
KPETVIKRLKAYEDQTKPVLEYYQKKGVLETFSGTETNKIWPYV 
YAFLQTKVPQRSQKASVTP 


5617 


176 


765 


PWRGRGSRPRGAGAMAEEQVNRSAGLAPDCEASATAETTVSSVG 
TCEAAGKSPEPKDYDSTCVFCRIAGRQDPGTELLHCENEDLICF 
KD I K PAATHH YL W P K KH I GNCRTLR KDQVE L VENMVT VG KTI L 
ERNNFTDFTNVRMGFHMPPFCSISHLHLHVLAPVDQLGFLSKLV 
YRVNSYWFITADHLIEKLRT 


5618 



3 


1692 


YLNYINLKSENKLSGKEDLWEKLQYLWKSTLNLPEDLLRVPDES 
L FLNSGGDS LKS I RLLS E I EKLVGTS VPGLLE 1 1 LSSS 1 LE I YN 
HILQTWPDEDVTFRKSCATKRKLSNINQEEASGTSLHQKAIMT 
FT CHNE I NAF WL S RGSQILSLNS TRFLTKLGH CS SAC P SVSVS 
QTN I QNLKG LNS P VL I G KS KDPS C VAKVS EEG KP AI GTQ KME LH 
VRWRSDTGKCVDASPLWIPTFDKSSTTVYIGSHSHRMKAVDFY 
SGKVKWEQILGDRIESSACVSKCGNFIVVGCYNGLVYVLKSNSG 
EKYWMFTTEDAVKS S ATMDPTTGL I Y I GSHDQHAYALD I YR KKC 
VWKSKCGGTVFSSPCLNLIPHHLYFATLGGLLLAVNPATGNVIW 
KHSCGKPLFSSPQCCSQYICIGCVLX5NLLCFTHFGEQVWQFSTS 
GP 1 FS S PCTS PSEQKI FFGSHDCFI YCCNMKGHLQWKFETTSRV 
YAT PFAFHNYNGSNEMLLAAAS TDGKVW I LESQSGQLQS VYELP 
GE VFS S P WL E S M L 1 1 G CRDN YVY CLDLLGGNQK 


5619 


2160 


1477 


DSPVLPTSGNVISTAQPAQPWSAVEAALRSLGSPPGAGRGCPCP 
AQS LHS HQLAAWD PLKPSLRS Y PPHLLQHPQLRS LTAS SGKLGR 
KbLFOFKt , ijliEL»LRAGSSTRPQPLTSSCCGMSCMYSFLGHCSVL 
LWGTKGRGSGSPSSPGCCLHPPAQHSQDLPLVHVDVGWQPPLGP 
TVGLRPGLLGERQRGALRAGDPQCQCPLPATVREDLGVPSPWAA 
ECSPPATP 


5620


930 


182 


PLPPPTLAMFLTRSEYDRGVNTFSPEGRLFQVEYAIEAIKLGST
AIGIQTSEGVCLAVEKRITSPLMEPSSIEKIVEIDAHIGCAMSG 
L I ADAKTLI DKARVETQNHWFT YNETMTVESVTQAVSNLALQFG 
EE DAD P GAMS R P FG VALL FGG VDEKG PQ L FHMDP SGTFVQCDAR 
AIGSASEGAQSSLQEVYHKSMTLKEAIKSSLIILKQVMEEKLNA 
TNIELATVQPGQNFHMFTKEELEEVIKDI 


5621 


3 


819 


WEFVEYTATDANVKWESLSSVQQLGIKMTVRYGKFLSLLKDGA 
ENDLT WVLKHCER FliKOOOT *5 T K .T .CT .nfiKrv arzunu pijcct it 

MIMLGDKEKTFQFLHQFSRLLTSAFLWLPRLHISSYLPNDTVES 
GIHPVYFCSTHYIEMLLKAELPLVFSAFHMSGFAPSQICLQWIT 
QCFWNYLDWIE1CHYIATCVFLGPDYQVYICIAVFKHLQQDILQ 
HTQTQDLQ VFLKE EALHGFR VS D YFE YME I LEQNYRT VLLRDMR 
NIRLQST 


5622 


1122 


456 


AASTKDAVSRKRSHSASEKSGTGTSISKRLNMNPQIRNPMKAMY 
PGTFYFQFKNLWEANDRNETWLCFTVEG I KRRS WSWKTGVFRN 
QVDSETHCHAERCFLSWFCDDILSPNTKYQVTWYTSWSPCPDCA 
GEVAEFLARHSNVNLTIFTARLYYFQYPCYQEGLRSLSQEGVAV
oirjuniurM^wciWrv i.NUiMc.i'r KPWKGLtKTNF RLLKRRLRESL 
Q 


5623 


3 


954 


FLPFFIRAPKISRNGQWL.FTFTTPFPFANKALPGWEGIVPACFW 
R KK I LT PS TGTM E LLQ VT ILFLLPSICSSNS TG VLEAANNS LW 
TTTKPSITTPNTESLQKNWTPTTGTTPKGTITNELLKMSLMST 
ATFLTSKDEGLKATTTDVRKNDSIISNVTVTSVTLPNAVSTLQS 
aiu^ia j.vodIK.1 IblPGSVIiQPDASPSKTGTIjTSIPVTIPENT 
SQSQVIGTEGGKNASTSATSRSYSSIILPWIALIVITLSVFVL 
VG L YRM C W KAD PGTPENGNDQ PQS DKE S VKLLT VKT I S HE S G EH 
SAQGKTKN 


5624 


159 


898 


PGVAAAAGALPQYHGPAPALVSCRRELSLSAGSLQLERKRRDFT 
SSGS RKLYFDTHALVCLLEDNGFATQQAE I I VSALVKILEANMD 
IVYKDMVTKMQQEITFQQVMSQIANVKKDMIILEKSEFSALRAE 
NEK I KLELHQLKQQVMDEVI KVRTDTKLDFNLEKSRVKELYSLN 
EKKLLELRTEIVALHAQQDRALTQTDRKIETEVAGLKTMLESHK 
LDNI KYLAGS I FTCLTVALGFYRLW I 


5625 


1 


1180 


TIPSSAAAQRAGPPAGALEALSPGGARAHAERRGEMRATPLAAP 
AGSLSRKKRLELDDNLDTERPVQKRARSGPQPRLPPCLLPLSPP 
TAPDRATAVATASRLGPYVLLEPEEGGRAYQALHCPTGTEYTCR 
VYPVQEALAVLEPYARLPPHKHVARPTEVLAGTQLLYAFFTRTH 
GDMHSLVRSRHRI PEPEAAVLFRQMATALAHCHQHGLVLRDLKL 
CR F VF ADRER KKL VLENLE DS C VLTG PDDS LWD KHAC P A YVG P E 
ILSSRASYSGKAADVWSLGVALFTMLAGHYPFQDSEPVLLFGKI 
RRGAYALPAGLSAPARCLVRCLLRRE PAERLTATGI LLHPWLRQ 
DPMPLAPTRSHLWEAAQWPDGLGLDEAREEEGDREWLYG 


5626 


3123 


2011 


PPRALGSVAMENQVLTPHVYWAQRHRELYLRVELSDVQNPAISI 
TENVLHFKAQGHGAKGDNVYEFHLEFLDLVKPEPVYKLTQRQVN 
ITVQKKVSQWWERLTKQEKRPLFLAPDFDRWLDESDAEMELRAK 
EEERLNKLRLESEGSPETLTNLRKGYLFMYNLVQFLGFSWIFVN 
LTVRFCILGKESFYDTFHTVADMMYFCQMLAWETINAAIGVTT
SPVLPSLIQLLGRNFILFIIFGTMEEMQNKAWFFVFYLWSAIE
IFRYSFYMLTCIDMDWKVLTWLRYTLWIPLYPLGCLAEAVSVIQ
SIPIFNETGRFSFTLPYPVKIKVRFSFFLQIYLIMIFLGLYINF 
KHLYKQRRRRYGQKKKKIH 


5627 


3123 


2011 


PPRALGSVAMENQVLTPHVYWAQRHRELYLRVELSDVQNPAISI
TENVLHFKAQGHGAKGDNVYEFHLEFLDLVKPEPVYKLTQRQVN
ITVQKKVSQWWERLTKQEKRPLFLAPDFDRWLDESDAEMELRAK
EEERLNKLRLESEGSPETLTNLRKGYLFMYNLVQFLGFSWIFVN
LTVRFCILGKESFYDTFHTVADMMYFCQMLAWETINAAIGVTT
SPVLPSLIQLLGRNFILFIIFGTMEEMQNKAWFFVFYLWSAIE
IFRYSFYMLTCIDMDWKVLTWLRYTLWIPLYPLGCLAEAVSVIQ
SIPIFNETGRFSFTLPYPVKIKVRFSFFLQIYLIMIFLGLYINF 
RHLYKQRRRRYGQKKKKIH 


5628


75 


1455 


VAGAMASKCLKAGFSSGSLKSPGGASGGSTRVSAMYSSSPCKLP 
SLS PVARS FSACSVGLGRSS YRATSCLPALCLPAGGFATS YSGG 
GGW FGEG I LTGNEKETMQS LNDRLAG YLEKVRQLEQENAS LES R 
IREWCEQQVPYMCPDYQSYFRTIEELQKKTLCSKAENARLWEI 
DNAKLAADD FR TKYBTE VS LRQ L VE S D I NG LRR I LDDLTL CKS D 
LEAQVESLKEELLCLKKNHEEEVNSLRCQLGDRLNVEVDAAPPV 
DLNRVLEEMRCQYETLVENNRRDAEDWLDTQSEELNQQWSSSE 
GLOSCOAEI IELRRTVNALEIEL.OAOH^MRDAT.fqtt bjtt'E'adv 

SSQIiAQMQCMITNVEAQLAEIRADLERQNQEYQVLLDVRARLEC 
EINTYRGLLESEDSKLPCNPCAPDYSPSKSCLPCLPAASCGPSA 
ARTNCSARP I CVPCPGGRF 


5629 


2287 


938 


GR PRS S SDNRN FLRE RAGLS S AAVQ TR I GNS AAS RRS PAAR P PV 
PAP P ALPRGR PGTEGS TS LS APAVL WAVAWWWSAVAW AMA 
NYIHVPPGSPEVPKLNVTVQDQEEHRCREGALSLLQHLRPHWDP 
QEVTLQLFTDG ITNKL I GCYVGNTMED WLVRI YGNKTELLVDR 
DEEVKSFRVLQAHGCAPQLYCTFNNGLCYEFIQGEALDPKHVCN 
PAI FRLI ARQLAKIHAIHAHNGWI PKSNLWLKMGK YFSL IPTGF
ADEDINKRFLSDI PSSQ 1 LQEEMTWMKE I L S NLG S P WLCHNDL 
LCKNI I YNEKQGDVQ FID YE YSGYNYLA YD IGNHFNE FAG VSDV 
DYSLYPDRELQSQWLRAYLEAYKEFKGFGTEVTEKEVEILFIQV 
NQFALASHFFWGLWALIQAKYSTIEFDFLGYAIVRFNQYFKMKP 
EVTALKVPE 


5630 


1194 


278 


GFWAIAQTCAHHLPPGSPWLVPASPWRLPEMSSFGYRTLTVAl^ - ' 

TLICCPGSDEKVFEVHVRPKIdiAVEPKGSLEVNCSTTCNQPEVG 

GLETSLDKILLDEQAQWKHYLVSNISHDTVLQCHFTCSGKQESM 

NSNVSVYQPPRQVILTLQPTLVAVGKSFTIECRVPTVEPLDSLT 

LFLFRGNETLHYETFGKAAPAPQEATATFNSTADREDGHRNFSC 

LAVLDLMSRGGNIFHKHSAPKMLEIYEPVSDSQMVIIVTWSVL 

LSLFVTSVLLCFIFGQHLRQQRMGTYGVRAAWRRLPQAFRP 


5631 


1053 


290 


SRVDDFVRPEPSRAEPSRSGRRRPARRAATMSVFGKLFGAGGGK 
AGKGGPTPQEAIQRLRDTEEMLSKKQEFLEKKIEQELTAAKKHG 
TKNKRAALQALKRKKRYEKQLAQIDGTLSTIEFQREALENANTN 
TEVLKNMGYAAKAMKAAHDNMDIDKVDEIJ^QDIADQQEIAEEIS 
TAISKPVGFGEEFDEDELMAELEELEQEELDKNLLEISGPETVP 
LPNVPS I ALPSKPAKKKEEEDDDMKELENWAGSM 


5632 


3 


952 


WLGWSPPRRLWWGSLGAAQRPAVPVSGLARSLHVETRRPHR
SVRVARGRLGVWAQPQPLLPRPVGSRREMQPPGPPPAYAPTNGD
FTFVSSADAEDLSGSIASPDVKLNLGGDFIKESTATTFLRQRGY
GWLLEVEDDDPEDNKPLLEELDIDLKDIYYKIRCVLMPMPSLGF
NRQWRDNPDFWGPLAWLFFSMISLYGQFRWSWIITIWIFGS
LTIFLLARVLGGEVAYGQVLGVIGYSLLPLIVIAPVLLWGSFE
WSTLIKLFGVFWAAYSAASLLVGEEFKTKKPLLIYPIFLLYIY
FLSLYTGV


5633 


771 


460 


QGCSKTMSVGRPFYRSSEFMEQLLSSHLHQVPFFCCFTWCLCN
CLFENSVSKLYMLCFNFFMSIFFYSLSITKLNLIYLWGLSYQSL
LLLLLSGHRPWGSSMV


5634 


1446 


855 


PRATGRIRSRAAASRPRAGAGASGAEPRSGRERSRLSGRRAPAM 
ARNTLSSRFRRVDIDEFDENKFVDEQEEAAAAAAEPGPDPSEVD 
GLLRQGDMLRAFHAALRNSPVNTKNQAVKERAQGWLKVLTNFK 
SSEIEQAVQSLDRNGVDLLMKYIYKGFEKPTENSSAVLLQWHEK 
ALAVGGLGS I IRVLTARKTV 


5635 


3 


943


DRGPRSTATDTGRARVS FWRFPLDPGVKNSNVQ I SGEKRRFRTL 
RS LFH P F P VTRSGAPRAVLVG S S W PAKMVAPAVKVARG WSGLAL 
GVRRAVLQLPGLTQVRWSRYSPEFKDPLIDKEYYRKPVEELTEE 
E KYVRELKKTQLI KAAPAGKTSS VFEDPVI S KFTNMMM IGGNKV 
LARSLMIQTLEAVKRKQFEKYHAASAEEQATIERNPYTIFHQAL 
KNCEPMIGLVPILKGGRFYQVPVPLPDRRRRFLAMKWMITECRD 
KKHQRTLM P E KLSHKLLEAFHNQGP VI KR KHDLHKMA E ANRALA 
HYRWW 


5636 


2253 


1143 


LEDTI CQHP PAEKKLYLYHRKIiREVERNG I PRLPKD VFMDTHQG 
LTDVRAKVTGFSEGWDSVKGGFSSFSQATHSAAGAWSKPRBI 
ASLIRNKFGSADNIPNLKDSLEEGQVDDAGKALGVISNFQSSPK 
YGSEEDCSSATSGSVGANSTTGGIAVGASSSKTNTLDMQSSGFD 
ALLHEIQEIRETQARLEESFETLKEHYQRDYSLIMQTLQEERYR 
CERLEEQLNDLTELHQNEILNLKQELASMEEKIAYQSYERARDI 
QEALEACQTRISKMELQQQQQQWQLEGLENATARNLLGKLINI 
LIAVMAVLLVFVSTVANCWPLMKTRNRTFSTLFLVVFIAFLWK 
HWDALFS YVERFFSS PR 


5637 


948 


2532 


MS FCGARANAKMMAAYNGGTSAAAAGHHHHHHHHLPHLP P PHLH 
HHHHPQHHLHPG S AAAVH P VQQHTS S AAAAAAAAAAAAAM LNP G 
QQQPYFPSPAPGQAPGPAAAAPAQVQAAAAATVKAHHHQHSHHP 
QQQLDIEPDRPIGYGAFGWWSVTDPRDGKRVALKKMPNVFQNL 
VSCKRVFRELKMLCFFKHDNVLSALDILQPPHIDYFEBIYWTE 
LMQSDLHKIIVSPQPLSSDHVKVFLYQILRGLKYLHSAGILHRD 
IKPGNLLVNSNCVLKICDFGLARVEELDESRHMTQEWTQYYRA 
PEILMGSRHYSNAIDIWSVGCIFAELLGRRILFQAQSPIQQLDL 
ITDLLGTPSLEAMRTACEGAKAHILRGPHKQPSLPVLYTLSSQA
THEAVHLLCRMLVFDPYKRISAKDALAHPYLDEGRLRYHTCMCK 
CCFSTSTGRVYTSDFEPVTNPKFDDTFEKNLSSVRQVKEIIHQF 
ILEQQKGNRVPLCINPQSAAFKSFISSTVAQPSEMPPSPLVWE 


5638 


125 


1155 


DRKMSELDQLRQEAEQLKNQIRDARKACADATLSQITNNIDPVG 
RIQMRTRRTLRGHLAKIYAMHWGTDSRLLVSASQDGKLIIWDSY
TTNKVHAIPLRSSWVMTCAYAPSGNYVACGGLDNICSIYNLKTR
EGNVRVSRELAGHTGYLSCCRFLDDNQIVTSSGDTTCALWDIET
GQQTTTFTGHTGDVMSLSLAPDTRLFVSGACDASAKLWDVREGM
CRQTFTGHESDINAICFFPNGNAFATGSDDATCRLFDLRADQEL
MTYSHDNIICGITSVSFSKSGRLLLAGYDDFNCNVWDALKADRA
GVLAGHDNRVSCLGVTDDGMAVATGSWDSFLKIWN 


5639 


125


1155 


DRKMSELDQLRQEAEQLKNQIRDARKACADATLSQITNNIDPVG 
RIQMRTRRTLRGHLAKIYAMHWGTDSRLLVSASQDGKLIIWDSY 
TTNKVHAIPLRSSWVMTCAYAPSGNYVACGGLDNICSIYNLKTR
EGNVRVSRELAGHTGYLSCCRFLDDNQIVTSSGDTTCALWDIET
GQQTTTFTGHTGDVMSLSLAPDTRLFVSGACDASAKLWDVREGM
CRQTFTGHESDINAICFFPNGNAFATGSDDATCRLFDLRADQEL
MTYSHDNIICGITSVSFSKSGRLLLAGYDDFNCNVWDALKADRA
GVLAGHDNRVSCLGVTDDGMAVATGSWDSFLKIWN


5640 


280 


1092 


QQGNKKTMLSHNTMMKQRKQQATAIMKEVHGNDVDGMDLGKKVS 
IPRDIMLEELSHLSNRGARLFKMRQRRSDKYTFENFQYQSRAQI 
NHSIAMQNGKVDGSNLEGGSQQAPLTPPNTPDPRSPPNPDNIAP 
GYSGPLKEIPPEKFNTTAVPKYYQSPWEQAISNDPELLEALYPK 
LFKPEGKAELPDYRSFNRVATPFGGFEKASRMVKFKVPDFELLL 
LTDPRFMSFVNPLSGRRSFNRTPKGWISENIPIVITTEPTDDTT 
VPESEDL 


5641 


27 


332 


CRHNCNGD V KLL S NQMDKL FAFHL FT FHGLLH FLDG S I Q KL I QA 
EIILSDNSSILVLENNFLFKVKSKQFIHLIAKKFYISITIVSAS 
NGESFVLSMIVTG 


5642 


199 


1247 


ITPCRMDFLVLFLFYLASVLMGLVLICVCSKTHSLKGLARGGAQ 
I FS C I I P E CLQRAMHG LLH Y L FHTRNHT F I VLH L VLQGMVYTE Y 
TWEVFGYCQELELSLHYLLLPYLLLGVNLFFFTLTCGTNPGIIT 
KANELLFLHVYE FDEVMFPKN VRCSTCDLRKPARS KHCS VCNWC 
VHRFDHHCVWVNNCIGAWNIRYFLIYVLTLTASAATVAIVSTTF 
LVHLWMSDLYQETYIDDLGHLHVMDTVFLIQYLFLTFPRIVFM 
LG F VWLS FLLGG YL L FVL YLAATNQTTN E WYRGD W AWCQR C PL 
VAWPPSAEPQVHRNIHSHGLRSNLQEIFLPAFPCHERKKQE 


5643 


1 


847 


PSGGVRDVETRGPGSRAARGPRWMHRRGVGAGAIAKKKLAEAK 
YKERGTVLAEDQLAQMSKQLDMFKTNLEEFASKHKQEIRKNPEF 
R VQFQDM CAT IGVDPLASGKG FWSEMIX3 VGDFY YELGVQI I E VC 
LALKHRNGGLITLEELHQQVLKGRGKFAQDVSQDDLIRAIKKLK 
ALGTG FG 1 1 P VGG T YL I QS VP AE LNMDHTWLQ LAE KNG YVT VS 
E I KAS L KW E TE RARQ VLEH LL KEGLAWLDLQAPG EAHYWLP ALF 
TDLYSQEITAEEAREALP 


5644 


83 


1138 


PRRMGS W VQL I TS VGVQQNH PG WT VAGQFQE KKR FTE E VI E YFQ 
KKVS P VHL KI LLTSDE AWKR F VR VAEL P RE E ADALY E ALKNLT P 
YVAIEDKDMQQKEQQFREWFLKEFPQIRWKIQESIERLRVIANE 
IEKVHRGCVIANWSGSTGILSVIGVMLAPFTAGLSLSITAAGV 
GLGI AS ATAG I AS S I VENT YTRS AELTASRLTATSTDQLEALRD 
ILHDITPNVLSFALDFDEATKMIANDVHTLRRSKATVGRPLIAW 
RYVPINWETLRTRGAPTR I VRKVARNLGKATSGVLWLDWNL 
VQDSLDLHKGEKSESAELLRQWAQELEENLNELTHIHOSLKAG 


5645 


537 


799


VQSVRDLKRLSPTDPPGDSGNRDVTREDPVTGPLNSASSQVPTL 
YL C LQNS LLGHS S VED ARATME L YQ I S QR I RARRGLPR LAVS D 


5646 


3745 


3328 


AEQYGTSPHLLPTMLLSSCLPPANVTTKAATPPPLVLSLTTADP 
AGKPAPCRVTLTLLRAS I PATKRAS FLSS FI KMFFEELEYILGF 
LSLLKFHVHVSVYSAICHFQKEGTGNSRSFTCTPELFPRLQTHL 
RAEGGAQ 


5647 


288 


800 


GVIMATSELSCEVSEENCERREAFWAEWKDLTLSTRPEEGCSLH 
E EDTQRHET YHQQGQCQVLVQRS PWLMMRMG I LGRGLQE YQLP Y
QRVLPLPIFTPAKMGATKEEREDTPIQLQELLALETALGGQCVD
RQEVAEITKQLPPWPVSKPGALRRSLSRSMSQEAQRG 


5648 


7 


1518 


VLSELCGRHEALREVGAEWPPPTCSPNICSGLQQAGNTDWSLTM
APQSLPSSRMAPLGMLLGLLMAACFTFCLSHQNLKEFALTNPEK
SSTKETERKETKAEEELDAEVLEVFHPTHEWQALQPGQAVPAGS
HVRLNLQTGEREAKLQYEDKFRNNLKGKRLDINTNTYTSQDLKS
ALAKFKEGAEMESSKEDKARQAEVKRLFRPIEELKKDFDELNW
IETDMQIMVRLINKFNSSSSSLEEKIAALFDLEYYVHQMDNAQD
LLSFGGLQWINGLNSTEPLVKEYAAFVLGAAFSSNPKVQVEAI
EGGALQKLLVILATEQPLTAKKKVLFALCSLLRHFPYAQRQFLK
LGGLQVLRTLVQEKGTEVLAVRWTLLYDLVTEKMFAEEEAELT
QEMSPEKLQQYRQVHLLPGLWEQGWCEITVHLLALPEHDAREKV
LQTLGVLLTTCRDRYRQDPQLGRTLASLQAEYQVLASLELQDGE
DEGYFQELLGSVNSLLKELR


5649 


1172 


3006 


MLQEQLDAINEEIRMIQEEKESTELRAEEIETRVTSGSMEALNL
KQLRKRGSIPTSLTDLSLASASPPLSGRSTPKLTSRSAAQDLDR
MGVMTLPSDLRKHRRKLLSPVSREENREDKATIKCETSPPSSPR
TLRLEKLGHPALSQEEGKSALEDQGSNTPSSSNSSQDSLHKGAKR
KGIKSSIGRLFGKKEKGRLIQLSRDGATCHVLLTDSEFSMQEPM
VPAKLGTQAEKDRRLKKKHQLLEDARRKGMPFAQWDGPTWSWL
ELWVGMPAWYVAACRANVKSGAIMSALSDTEIQREIGISNALHR
LKLRLAIQEMVSLTSPSAPPTSRTSSGNVWVTHEEMETLETSTK
TDSEEGSWAQTLAYGDMNHEWIGNEWLPSLGLPQYRSYFMECLV
DARMLDHLTKKDLRVHLKMVDSFHRTSLQYGIMCLKRLNYDRKE
LEKRREESQHEIKDVLVWTNDQWHWVQSIGLRDYAGNLHESGV
HGALLALDENFDHNTLALILQIPTQNTQARQVMEREFNNLLALG 
TDRKLDDGDDKVFRRAPSWRKRFRPREHHGRGGMLSASAETLPA 
GFRVSTLGTLQPPPAPPKKIMPEAHSHYLYGHMLSAFRD 


5650


1172 


3006 


MLQEQLDAINEEIRMIQEEKESTELRAEEIETRVTSGSMEALNL
KQLRKRGSIPTSLTDLSLASASPPLSGRSTPKLTSRSAAQDLDR
MGVMTLPSDLRKHRRKLLSPVSREENREDKATIKCETSPPSSPR
TLRLEKLGHPALSQEEGKSALEDQGSNPSSSNSSQDSLHKGAKR
KGIKSSIGRLFGKKEKGRLIQLSRDGATGHVLLTDSEFSMQEPM
VPAKLGTQAEKDRRLKKKHQLLEDARRKGMPFAQWDGPTWSWL
ELWVGMPAWYVAACRANVKSGAIMSALSDTEIQREIGISNALHR
LKLRLAIQEMVSLTSPSAPPTSRTSSGNVWVTHEEMETLETSTK
TDSEEGSWAQTLAYGDMNHEWIGNEWLPSLGLPQYRSYFMECLV
DARMLDHLTKKDLRVHLKMVDSFHRTSLQYGIMCLKRLNYDRKE
LEKRREESQHEIKDVLVWTNDQWHWVQSIGLRDYAGNLHESGV
HGALLALDENFDHNTLALILQIPTQNTQARQVMEREFNNLLALG
TDRKLDDGDDKVFRRAPSWRKRFRPREHHGRGGMLSASAETLPA 
GFRVSTLGTLQPPPAPPKKIMPEAHSHYLYGHMLSAFRD 


5651 


646 


1869 


ARQGQRQPWG*EARAKGPASESPRV*EGSGWEGPASP*TPGSTL
AWG EG AG I R * AS GLTAAGAAS AAAA/ P P PTRGG P APAG CGRAP P 
WPAPLRVPTHGRAPAPRSRAAPRAPALSHGTAAAALSPASPAGP 
ADP * LPGHS SQS PPRG * RWGRSRS APAPAHPEHPAPAGSAS ASQ 
QTPGWPGSCCLAQGWQAEPLGAPGAEDG\PVPPQRGFPLGTLGS 
PAGS WAGLAG YG * AGAPGTQATAPRAAGQTPVAAAPNCRV+ GSA 
PALHRAPAAADPGSPLQAPPRAWASPAAAGPGLSSSDYCGGLGA 
G WRAG I S PELLGAAGLSDNWARCPG PG PAE * GGQ PGCR TI P ASA 
CMPS P PVEGSLGLSRKGHGDLPSQAR * GWHE CRRARHLVPLPRL 
LGPRGRTGRPSS PS 


5652


735


343


HHKKYQHIHQKSFSCPEPACGKSFNFKKHLKEHMKLHSDTRDYI
CEFCARSFRTSSNLVIHRRIHTGEKPLQCEICGFTCRQKASLNW
HQRKHAETVAALRFPCEFCGKRFEKPDSVAAHRSKSHPALLLA


5653


66 


1401 


RGRLQSRGRbTLGLVLLLLblLGARQH6QRVSHGWKGGFLTAPL ' 
CF PQP CQ PGTRRGRRRS LKE ATE PQ LAMAE E FVTLKD VGMD FTL 
GDWEQLGLEQGDTFWDTALDNCQDLFLLDPPRPNLTSHPDGSED 
LEPLAGGS PEATS PDVTETKNSPLMEDFFEEGFSQEI/SRDVIQ 
GWLLELQFRRSLYRGHLVR+FARRSRKSSEV*YCHQRGKSHGMQ
ES * I KERTQS CVHRFHGRRFHG \DNVSE KTLTPAKSKEYRGEFF 
^xz>uni3\j\juz> vuc.tjrbKpygcSECGKSFSGSYRLTQHWITHTREK 
PTVHQECEQGFDRKASHSGYPKTHTGYKFYVCNEYGTPFSQSTY 
LWHQKTHAGEKPCKSQDSDHPPSHDTQSGEHQKTHTDSKSYNCN 
ECGKAFTRIFHLTRHQKIHTRKRYECSKCQATFNLRKHLIQHQK 
THAANV 


5654 


3 


598 


TLPLFPGRR FRG WRR CGAVAAR KNS TGGNVS I NQRRDS VRMS AL 
N W KP FVYGG LAS I TA R CGT FP I D LTKTR FQ I QGQTNDAKFKE 1 1 
x KIjMijH/Uj V KI bKEEGIiKAiiYSG * VGLHAFLCHCSLFHMG IDFR 
PRLHRSQVKSLRCV*KEQIA**/MFSLLISTLISKYIYYAADVL 
EKLFYYIQVQTDNNKKICLFKNI 


5655 


2 


867 


RPPGIRAPRQLHPAAGRRPDASARPRFRPTVLLHDPFQLSFPPP 
PLSYPSVFPAVARVLPQRSGDYRAAGMPQLSGGGGGGGGDPELC 
ATDEMIPFKDEGDPQ\REKIFAEIVNPEEEGDLADIKSSLVNES 
EI I PASNGHE VARQAQTSQEPYHDKAREHPDDGKHPDGGLYNKG 
PSYSSYSGYIMMPNMNNDPYMSNGSLSPPIPRTSNKVPWQPSH 
AVHPLTPLITYSDEHFSPGSHPSHIPSDVNSKQGMSRHPPAPDI 
PTFYPLSPGGGGQITPPLGWQGQP 


5656 


228 


1066 


PRRVPPLPEFASGPGAAFFHSGRLQRSLTKDSAGCFSQCRSRAM
LVLRSGIiTKALASRTLAPQVCSSFATGPRQYDGTFYEFRTYYLK 
PSNMNAFMENL KKN I HLR TS YSELVG FWS VEFGGRTNK V FH I W K 
YDNFPHRAEVRKALANCKEWQEQS I I PNLARIDKQETEITYLI P 
WSKLQKPPKEGVYELAVFQMKPGGPALWGDAFERAINAHVNLGY 
TKWG VFHTE YGELNRVHVLWWNESADS RAAVRHKS HEDP I S WG 
GVRESVNYL\VSQQNM 


5657 


105 


1052 


GQRLQSPRVQMPVQPPSKDTEEMEAEGDSAAEMNGEEEESEEER 
SGSQTESEEESSEMDDEDYERRRSECVSEMLDLEKQFSELKEKL 
FRERLSQLRLRLEEVGAERAPEYTEPLGGLQRSLKIRIQVAGIY 
KGFCLDVIRNKYECELQGAKQHLESEKLLLYDTLQGELQERIQR 
LEEDRQSLDLSSEWWDDKLHARGSSRSWDSLPPSKRKKAPLVSG 
PYIVYMLQEIDILEDWTAIKKARAAVSPQKRKSD\DLDPAVHSQ 
GDPQSSWHCTQDSRLPPADRRTHRPLRVCPARLLWCCWALPLHL 
ALVWTPPL 


5658 




3541 


TERRVYNPWPEPDPD\CIQEDPWNLPNSIKTLVDNIQRYVEDGK 
NQLLLALLKCTDTELQLRRDAIFCQALVAAVCTFSEQLLAALGY 
RYNNNGEYEESSRDASRKWLEQVAATGVLLHCQSLLSPATVKEE 
RTMLED I WVTLS ELDNVTFS FKQLDEN YVANTNVF YH I EGS RQA 
LKVIFYLDSYHFSKLPSRLEGGASLRLHTALFTKVLENVEGLPS 
PGSQAAEDLQQDINAQSLEKVQQYYRKLRAFYLERSNLPTDAST 
TAVKI DQL IRP INALDELCRLMKSF VHPKPGAAGS VGAGL I P I S 
SELCYRLGACQMVMCGTGMQRSTLSVSLEQAAILARSHGLLPKC 
I MQATD I MRKQGPR VE I LAKNLRVKDQM PQGAPRL YRLCQPKMN 
GDL 


5659 


2 


696 


WKRSGEVS PKGELGAWRGNSGRPKI IGRAAEAENEDRTLGRLLP 
GNERSQPRS PLRLLAPQLKAEAAADKGLAPVPPPFS SGHSGPC\ 
EREGEGQRGRGRSRRGAHLELKPSPGLRAGAPTDRGRGGPABVA 
AAGGRRMVQKESQATLEERESELSSNPAASAGASLEPPAAPAPG 
EDNPAGAGG \ AAVAG AAGGARR FLCG WEG FYGR P WVMEQR KE L 
FRRLQKWELNTYL 


5660 


229 


853 


PVTMWAFSELPMPLLINLIVSLLGFVATVTLIPAFRGHFIAARL 
CGQDLNKTSRQQIPESQGVISGAVFLIILFCFIPFPFLNCFVKE 
QRKAFPHHEFVALIGALLAI CCMIFLGFADDVLNLRWRHKLLLP 
TAAS LPLLM VYFTNFGNTT I WP KP FRP I LGLHLDLGR * S YHC C 
PYGT YFREPFLVLHI LLQVFLFCIiCVFPDP FW 


5661 


2 


473 


LNLYPSPCGGIPKLPGLPREAAAALGASFLAEAPLPVTVRGSGL
AGMAVTCD P KAFLS I CFVTLVFLQLPLAS ICQN*GTDSCASRGK 
ADFDVTGPHAPILAMAGGHVELQCQLFPNISAEDMELRWYRCQP 
S LAVHMH ERGM DMD G E Q KWQ YRGRT 


5662 


2 


1318 


LRKEGRCRRGSNRGVWAAPAEGLGGRGMLGVRCLLRSVRFCSSA 
PFPKMKPSAKLSVRDALGAQNASGERIKIQGWIRSVRSQKEVLF
LHVNDGSSLESLQWADSGLDSRELTFGSSVEVQGQLIKSPSKR
QNVELKAEKIKVIGNCDAKDFPIKYKERHPLEYLRQYPHFRCRT 
NVLGSILRIRSEATAAIHSFFKDSGFVHIHTPIITSNDSEGAGE 
LFQLEPSGKLKVPEENFFNVPAFLTVSGQLHLEVMSGAFTQVFT 
FGPTFRAENSQSRRHLAEFYMIEAEISFVDSLQDLMQVIEELFK 
ATTMMvLSKCPEDVE3jCHKFIAPGQKDRL*HMLKNNFLIISYTE 
AVE I LKQASQNFTFTPEWGADLRTEHEKYLVKHCGNI PVFVINY 
PIjTLKPFYMRDNEDGPQEIjEGSVA*HSLGLMILLSIWIGQP 


5663 


119 


698 


PADIGRSTAKTPGPPRSLEMDDPRYGMCPLKGASGCPGAERSLL 
VQSYFEKGPLTFRDVAIEFSLEEWQCLDSAQQGLYRKVMLENYR 
NLVFLGIALTKPDLITCLEQGKEPWNIKRHEMVAKPPVICSHFP 
QDLWAEQD I KDS FQEAI LKKYGKYGHANFQLQKGCKSVDECKVH 
KEHDNKLNQCLI PKKKK 


5664


118 


572 


SLSMESNHKSGDGLSGTQKEAALRALVQRTGYSLVQENGQRKYG
GPPPGWDAAPPERGCEIFIGKLPRDLFEDELIPLCEKIGKIYEM 
RMMMDFNGNNRGYAFVTFSNKVEAKNAIKQLNNYEIRNGRLLGV 
CASVDNCRLFVGGIPKTKK 


5665 


347 


702 


WQHLIILLHCERTSPAMITSELPVLQDSTNETTAHSDAGSELE 
ETEVKG KRKRGR PGRPP S TNKKPRKS PGEKSRI EAG IRGAGRGR 
ANGHPQQNGEGEPVTLFEWKLGKSAMQRC 


5666 


213 


540 


VSCLPTSCKMITLNNQDQPVPFNSSHPDEYKIAALVFYSCIFII 
GLFVNITALWVFSCTTKKRTTVTIYMMNVALVDLIFIMTLPFRM 
FYYAKDEWPFGEYFCQILGA


5667

1 


695 


HPLPSASLGLPSVSLGVSLCVRSALLEAWPMLPKRRRARVGSP 
SGDAASSTPPSTRFPGVAIYLVEPRMGRSRRAFLTGLARSKGFR 
VLDACSSEATHVVMEETSAEEAVSWQERRMAAAPPGCTPPALLD 
ISWLTESLGAGQPVPVECRHRLEVAGPSKGPLSPAWMPAYACQR 
PT PLTH HNTGL S E ALE I LAE AAG FEG S EGRLLT FCRAAS VLKAL 
PSPVTTLSQLQ 


5668


691 


894 


CS FLFC I PDLFLQFLLGRKEEEAVLVGGEWS PSLDGLDPQADPQ 
VLVRTA I RCAQAQTG I DLSGCTKW 


"" 5669 


407 


1 


DSGAPEGLSPLMSTQEGLSMHAHPQAYTPFIYLHARKRRGEIGD
ADSRFNDRYAHKSAQLYFLYFVCWIFQDVYYFTIKEKNHFFFPK 
ARGAPTKYSGS P IGS PTTTPPTRPPS FNLHPAPHLLASMQLQKL 
NSQ 


5670 


3 


373 


SSECLTMAWIPLLLPLLILCTVSVA^YEtiAQPSSVSVSPGQTAK 
I TCS GD VLAKK YARW FQQ K PGQAPVLV I YKDTE RPSGIPERFSG 
S TSGTTVTLTI SdAQVEDEAD Y FCYS ATDNFLWVF 


5671


280 


524 


KFPPKKTPPHLGMESAITLWQFLLQLLLDQKHEHLICWTSNDGE
FKLLKAKKVAKLWGLRKNKTNMNYDKLSRALRLLFMT 


5672 


2 


557 


FVPATPDPGVWLPPSRDPAMAKRSSLYIRIVEGKNLPAKDITGS 
SDPYCIVKVDNEPIIRTATVWKTLCPFWGEEYQVHLPPTFHAVA 
FYVMDEDALSRDDVIGKVCLTRDTIASHPKGKFSLPSHTGLPSP 
WPPSHSETSPLGSVWSPAQGKPFLLSPEAGATFCTPGLCSAACS 
QAWLLLPLP 


5673 


327 


696 


ITVADQISHWSAGRIKNRTRIPECIHSSAATTLAGPHTMEGESV 
KLSSQTLIQAGDDEKNQRTITVNPAHMGKAFKVMNELRSKQLLC 
DVMIVAEDVEI EAHRWLAACS PYFCAMFTGDMS 


5674 


17


984 


GGGSMEGESTSAVLSGFVLGALAFQHLNTDSDTEGFLLGEVKGE
AKNSITDSQMDDVEWYTIDIQKYIPCYQLFSFYNSSGEVNEQA 
LKKILSNVKKNWGWYKFRRHSDQIMTFRERLLHKNLQEHFSNQ 
DL VFL LLTP3IITESCS THRL EHS L YK P Q KGLFHR VP L WANLG 
MSEQLGYKTVSGSCMSTGFSRAVQTHSSKFFEEDGSLKEVHKIN 
EM YAS LQEELKS I CKKVEDS EQAVDKLVKDVNRLKRE I EKRRGA 
QIQAAREKNIQKDPQENIFLCQALRTFFPNSEFLHSCVMSLKID 
MFLKVAVTTTTISM 


5675


80


753 


EGSRRGPTRLARLSARAGRLHFPPGFSSRLIHFRGVSECRRPPG 
KSGVPVSAPGSDGKWWEERPGMFSLMASCCGWFKRWREPVRKVT 
LLMVGLDNAGKTATAKGIQGEYPEDVAPTVGFSKIWLRQGKFEV 
T I FDLGGG I R I RG I VJKNYYAES YG VI FWDS SDEERMEETKEAM 
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SEMLRHPRISGKPILVLANKQDKEGALGEADVIECLSLEKLVNE 
HKCL 


5676 


2 


930 


FVSSPPPRPVQPARPGGFGLSGRRSLLCQVASTPAHVGVMRSPV
RDLARNDG E ES TDRT PLLPGAP RAE AA P VCCS AR YNLAI LAFFG 
FFIVYALRVNLSVALVDMVDSNTTLEDNRTSKACPEHSAPIKVH 
HNQTGKKYQWDAETQGWILGSFFYGYIITQIPGGYVASKIGGKM 
LLGFGILGTAVLTLFTP I AADLGVGPL I VLRALEGLGEGVTFPA 
MHAMWS S WAP P LER S KLLS I S YAGAQLG TVI SLPLSGII C YYMN 

WTYVFYFFGTIGIFWFLLWIWLVSDTPQKHKRISHYEKEYILSS 
L 


5677


1 


1028 


PPRDGFLELRRLSVPLCSGPCPLTSLSRQGERSGGHLVAAARAA 
VTAETHPLPLLAPLAVCQSVKSPAACQVRPRPRAVALPAALGGP 
GRSLPGLTAATMSSFSESALEKKLSELSNSQQSVQTLSLWLIHH 
RKHAGPIVSWHRELRKAKSNRKT..TFLYLANDVIQNSKRKGPEF 
TREFESVLVDAFSHVAREADEGCKKPLERLLNIWQERSVYGGEF 
IQQLKLSMEDSKSPPPKATEEKKSLKRTFQQIQEEEDDDYPGSY
S PQDPSAGPLLTEELI KALQDLENAASGDATVRQKIASLPQEVQ 
DVSLLEKITDKEAAERLSKTVDEACLRNRGPGTS 


5678 


3 


593 


SSSPPSSTPSLPLPFYLLLGQLRLQLLWGTAHLSGAGEAAPCPG 
GSGRTAAPRTRADPAAQSLMIMNKTIKNFKRRFSLSVPRTETIEE 
SLAEFTEQFNQLHNRRNENLQLGPLGRDPPQECSTFSPTDSGEE 
PGQLSPGVQFQRRQNQRRFSMEVRASGALPRQVAGCTHKGVHRR 
AAALQ P DFD VS KRLS L PMD I 


5679 


2 


623 


LNSRVDDFVAVPGAIMDEDYYGSAAEWGDEADGGQQEDDSGEGE 
DDAE VQQECLHKFSTRD Y I MEPS I FNTLKR Y FQAGGS PENVIQL 
LS ENYTAVAQTVNLLAEWL I QTG VEP VQVQETVENHLKS LL I KH 
FDPRKADSIFTEEGETPAWLEQMIAHTTWRDLFYKLAEAHPDCL 
MLNFTVKVGR VL E LRR KVFMNVY FWLL VCFL 


5680 


258 


592 


RRLTSTSEKLQNRNSHTPLESLIHlPQPSYKGFGIMFGKKKKKIE 
ISGPSNFEHRVHTGFDPQEQKFTGLPQQWHSLLADTANRPKPMV 
DP S C I T P IQLAPMKT I VRGNKP C 


5681 


45 


869 


LLCAKT LGVRT KE S QAE G YNRSG I NNHQ AED PR FC PS F CWMRS A 
RQTRPQRLRKEAARPPTPGSCPGGTGMDGKKCSVWMFLPLVFTL 
FTSAGLWIVYFIAVEDDKILPLNSAERKPGVKHAPYISIAGDDP 
PAS CVFS QVMNMAAFLAL WAVLRF I QLKPKVLNPWLN I SGLVA 
I.iCLA S FG MT L LGN FQLTNDE E IHNVGTS LT FG FGTLTC W I Q AAL 

TLKVNIKNEGRRVGIPRVILSASITLCVGPLLHPHGPKHPHVCS 
QGPVGPGHVL 


5682 


39 


622 


PSRSCLGTMRKWRHREVNLPEVTQQDAVCPAPI PS PGLSAQTGL 
QKIWGTIHCQVCPGAPAWPGSPWHEEMGLLLLVPLLLLPGSYGL 
P F YNG F Y YSNS ANDQNLGNGHGKDLLNG VKL WET PEETL F T YQ 
GAS VI L P CRYRYE PALVS P RR VRVKW W KLSENG AP EKD VL VA I G 
LRHRS FGDYQGRVHLRQD 


5683 


89 


778 


GSCGATALITRCLAWSVLISRLAMATYTCITCRVAFRDADMQRA 
HYKTDWHR YNLRR KVASMAPVTAEG FQERVRAQRAVAEEES KGS 
ATYCTVCS KKFAS FNAYENHLKSRRHVELEKKAVQAVNRKVEMM 
NEKNLEKGLGVDS VDKDAMNAAIQQAI KAQPSMS P KKAPPAPAK 
EARNWAVGTGGRGTHDRD PS EKP PRLQWFEQQAKKIiAKHS EDD 
SEDEEHDLC 


5684 


195 


677 


TWCFRGYLGPRVIMKALDEPPYLTVGTDVSAKYRGAFCEAKIKT 
AKRLVKVKVTFRHDS STVE VQDDH I KG PLKVGAI VE VKNLDGAY 
QEAVINKLTDASWYTWFDDGDEKTLRRSSLCLKGERHFAESET 
LDQLPLTNPEHFGTPVIGKKTNRGRRYE 


5685 


779 


1262 


LLLQQPWHCFLLFPPFRFSHHMIPGPPGPHTTGIPHPAiVTPQ 
VKQEHPHTDSDLMHVKPQHEQRKEQEPKRPHIKKPLNAFMLYMK 
EMRANWAECTLKESAAINQILGRRWHALSREEQAKYYELARKE 
RQLHMQLYPGWSARDKTYVS PSS I PVALHS 


5686


128 


1181 


CTWWQWITLLDINDNHPTWKDAPYYINLVEMTPPDSDVTTVVA 
VDPDLGENGTLVYSIQPPNKFYSLNSTTGKIRTTHAMLDRENPD 
PHEAELMRKIWSVTDCGRPPLKATSSATVFVNLLDLNDNDPTF 
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Codon, /^possible nucleotide deletion, 
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QNLPFVA&VLE6I PAGVs I YQ WAIDLDEGLNGLVS YRMPVGMP 
RMDFLINSSSGVVVTTTELDRERIAEYQLRVVASDAGTPTKSST 
STLTIHVLDVNDETPTFFPAVYNVSVSEDVPR\GSGWSG*AARN 
NDVGLNAELSYFITGGNVDGKFSVGYRDAWRTWGLDRETTAA 
YMLI LEA I DNGP VGKRHTGTATVFVTVLDVNDKRPI ILQSS YV 


5687 


17 


917 


AAPPAPPDG/PPP/PPPAPPT/PGPAA/APASSCQPRLSAGRAA 
QGDGGAAAVGH VL WPAVGPVRVNPGLQTPVPRPELLPG P \ SSS 
LHSDSSYPPDAGLSDDEEPPDASLPPDPPPLTVP/ADA/PMPVT 
SGCRMPSTSASE/AAGGQGACTHAKGSETPP PAS PQTSE PAPS P 
LP PHLTGG PGM YS SEAKLPNS FSCLG1»AGTGAG I * GTAS AHGTG 
PPVLPHVCTPSLANPQP\AVGPEASSLPLGVSGIGMSA/SAPIS 
SSPFVAIGSCWLRGIPPPGSGFLCPGRAPGPVPITTHGQEGQGP 
VLDI 


5688 


1 


420 


LTKWDLFGNC YRLLKTGI EHGAMPEQVGVYWYS / CLYDSRKLFF 
*SHMI IRSLL+KVIDDSLGQLPLLRELLL* *LNVIDRCIILAYV 
LRVEKTFAITYLKNFTVKVDFSLLGEIPLISMAAILKLWIMKID 
DGYIPAVF 


5689 


1504 


3 


HELSGKHISMVSGNTCNWHPGGHSPGGGGQGBITSKDRGEIPAL 
IWA/RKPIGTWTATKPTHRAG*GGAEEYQPPPQPCEGPRSTSRG 
GEG*GHAVGPGREIGKEGSLPFLGPKALGF*SASCQRAFEGGAH 
GSTARKPAPATPGTRHPRTMETREVAQGWPAGPRSQFWDQHPHS 
PGEHRPSG\SPLPACPPRAWPKAGAVASATGTG\PQLPGSRGKQ 
KLPRTREPPLLQAGWAVRKPPWSEAKEGLGQAGRPSGMDSSAS\ 
PQTPGGRGSLEWGLPLYLGPHHDVK*RSDRLG*PP*GGQGGGGH 
GAPSTPGPGGEAW*LPQQTSRPKPGPQAY+GE\GSPGLQCPCSK 
EL*RVPPGSLGPSTQCMYEPTDKHS\GGADAQLEVSTAGSRSTF 
GQELKGPLDAGRLWPGAPSASSSHR*GG*ERARAGAGHRGST*A 
SSKIEQGRPRPGPTSDALADVEGGAES/GPHPWPLPGTLPNR/P 
GS PPPA* ASAGKKGTVSTLGGGLL 


5690 


1424 


58 


PSPPAGVCAAPAPLPLLALARRDRRPCSPGAEAAPWQTGGPAID
GAWRTSVSALRRGATG/APCSPGAEAAPWQTGGPAIDG\DGELP 
*VRSEEAPRGCGAEGGGPGSGPVRRPGAGRGAHAGQGRQQDPEP 
DGLRHRQHGAASHARHRLQRLRPGHHQNRHVRRDPQAPPGGPAP 
GHAAALPERTRGVAE P PAWAHAGSDAWRAGR* SQRT * ERAR PRH 
PTFQGRAGS\GQPGYQPPNPHPGPSSPPAAP\GPRGA*GNPQLE 
KAPRSDRNPSQGI.RTRIRRPETPDOGPPSPAGSSASASTFRCTS 
SLSLLG P / PGAHNLDTAPQDR * HGP * GDKRGAPGVAGEDPRPP * 
GNFVR*LLLMP/GVA*RHGTSPFLGPSLGENGGQWDSGNLFGTP 
KG * SH PA FT KST * S ME AEKS Y WNH PHR \ DRGRQG VR I NC LRVG E 
SEMWGPYSAPRPGTVFLSSFLSPASEEH\PEGSSSFNTPFPPAG 
PEGDPGLNS PGLLP 


5691 


107 


550 


ISNDPSPGYNIEQMAKRGKKLVELPYTVKGMDVSFSGILSFIED
VAHRMLATGECTPEDLCFSLQVMQ*KTGTESWG*RFYIVEQN*S 
GDAPL I FS P YLSLTGNCGFAMLVE I TERAMAH\ CGS PGGPSLWG 
GVGVYVLLESVPLSYS 


5692


1193 


548 


TQAWTRAEKDRKGSVRALRLHLERGPPT*RGSHPL\QSVPCIQK

PSIFSSYPI/GLPQSGGEPGPVGEQQPVRRPEQPSCGPASRMPL 

TSRSVPPGRGALPPDSLSTRKGLPRPSTAGHRVRESGHKVPVSQ 

RLNLPVMGATRSNLQPPRKVAVPGPTR*RDQDSKQDFSSKPLQS 

VPGLASTQQTLTPADSGPGTGGRDATRAGLPGVETMGNGVD


5693

1258 


1330 


ALT WP VRKG TTW WAQ PHGCSNL VS RARLD LS S R PSQNTE PQAP 
*QAGPPSSLRPP\SRRR*APEWPKRATGSRCRGLSAPPWPWPAA 
RGE/PGSAPSHAP/PNSPRPSGTRHP/PGPSSRVLYSPSLPRNS 
P E A I VWR S S R FP LW F P LRCCFW VS GF KDPNP VLRFF 


5694 


3 


1338 


GSKEPARSLHRRGSGHKSSAGKWGSVTLSTAGALG*KQLHQ*WT

QRCL\NNLSSEEFNASSSLNSLPSTPTASRRNSTIVLRTDSEKR 

SLAESGLSWFSESEEKAPKKLEYDSGSLKMEPGTSKWRRERPES 

CDDSSKGGELKKPISLGHPGSLKKGKTPPVAVTSPITHTAQSAL 

KVAGKPEGKATDKGKLAVKNTGLQRSSSDAGRDRLSDAKKPPSG 

IARPSTSGSFGYKKPPPATGTATVMQTGGSATLSKIQKSSGIPV
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location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 

amino acid

residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion) 








KPVNGRKTSLDVSNSAEPGFLAPGARSNIQYRSLPRPAKSSSMS 
VTGGRGGPRPVSSSIDPSLLSTKQGGLTPSRLKEPTKVASGRTT 
PAPVNQTDREKEKAKAKAVALDSDNISLKSIGSPESTPKNQASH 
PTATKLAELPPTPLRATAKSFVKPPSLANLDKVNSNSLDLPSSS 
DTTQCI 


5695 


3 


1336 


GSKEPARSLHRRGSGHKSSAGKWGSVTLSTAGALG*KQLHQ*WT 
QRCL\NNLSSEEFNASSSLNSLPSTPTASRRNSTIVLRTDSEKR 
SLAESGLSWFSESEEKAPKKLEYDSGSLKMEPGTSKWRRERPES 
CDDSSKGGELKKPISLGHPGSLKKGKTPPVAVTSPITHTAQSAL 
KVAGKPEGKATDKGKLAVKNTGLQRSSSDAGRDRLSDAKKPPSG 
IARPSTSGSFGYKKPPPATGTATVMQTGGSATLSKIQKSSGIPV 
KPVNGRKTSLDVSNSAEPGFLAPGARSNIQYRSLPRPAKSSSMS 
VTGGRGGPRPVSSSIDPSLLSTKQGGLTPSRLKEPTKVASGRTT 
PAPVNQTDREKEKAKAKAVALDSDNISLKSIGSPESTPKNQASH 
PTATKLAELPPTPLRATAKSFVKPPSLANLDKVNSNSLDLPSSS 
DTTQCI 


5696 


3 


1338 


GSKEPARSLHRRGSGHKSSAGKWGSVTLSTAGALG*KQLHQ*WT

QRCL\NNLSSEEFNASSSLNSLPSTPTASRRNSTIVLRTDSEKR 

SLAESGLSWFSESEEKAPKKLEYDSGSLKMEPGTSKWRRERPES 

CDDSSKGGELKKPISLGHPGSLKKGKTPPVAVTSPITHTAQSAL 

KVAGKPEGKATDKGKLAVKNTGLQRSSSDAGRDRLSDAKKPPSG 

IARPSTSGSFGYKKPPPATGTATVMQTGGSATLSKIQKSSGIPV 

KPVNGRKTSLDVSNSAEPGFLAPGARSNIQYRSLPRPAKSSSMS 

VTGGRGGPRPVSSSIDPSLLSTKQGGLTPSRLKEPTKVASGRTT 

PAPVNQTDREKEKAKAKAVALDSDNISLKSIGSPESTPKNQASH

PTATKLAELPPTPLRATAKSFVKPPSLANLDKVNSNSLDLPSSS 
DTTQCI 


5697 


1147 


47 


PSEALSPPACPSAPAPRRSIISRLFGTSPATBAAPPPPEPVPAA 
QGPATVQSVEDFVPDDRLDRSFLEDTTPARDEKKVGAKAAQQDS 
DSDGEALGGNPMVAGFQDDVDLEDQPRGSPPLPAGPVPSQDITL 
SSEEEAEVAAPTKGPAPAPQQCSEPETKWSSIPASKPRRGTAPT 
RTAAPPWPGGVSVRTGPEKRSSTRPPAEMEPGKGEQASSSESDP 
EGPIAAQMLSFVMDDPDFESEGSDTQRRADDFPVRDDPSDVTDE 
DEGPAEPPPPPKLPLPAFRLKNDSDLFGLGLEEAGPKESSEEGK 
EGKTPSKENKKKKKKGKEEEEKAAKKKSKHKKSKDKEEGKEERR 
RRQQRPPRSRERTAA 


5698 


2 


666 


GAEAAEPQEDLPPLSQSSRFFQEQQKMNKSLGPVSFKDVAVDFT 
QEEWQQLDPEQKITYRDVMLENYSNLVSVGYHIIKPDVISKLEQ 
GEEPWIVEGEFLLQSYPDEVWQTDDLIERIQEEENKPSRQTVFI 
ETLI*R/ERGNVPGNTFDVETNPVPSRKIAYTHSLCNSCER\GF 

NASSEYISSDGRYARMKADECSGCGKSLLHIKLEKTHPGDQAYE 
FNQ 


5699 


2 


1448 


RVRQPPGLWVRRTVPAMQCPAGLSRVPGVAG/DPSLPSFRGPRD 
EAAHRGTIQTARHTRKLYVQGPASGPPLPRVSTQVAI*DEKPLA 
RPS/GRTNAPFPQGQKPAGKAAPGPAAAGRVAMR\PGHPGLLAS 
DSQRSSSKGSGWETPVPWS*AQPGWVSGLLLLGDPSGPGSL+RS 
TWLVGGARGPEGSGVRGSGWPSGCSDIGWALAGWNHS*HLDPNTT 
WTQKWTGE/SPAPGEEG\VAPAPRGPTAEHGHCELTTESQYSNN 
VPILFQNPSGALRSRRTEPAGWVPPTRHE*DDG*TAAPASGGAP 
VSTPTWAGTP/LNASLGPTDPQGKPGCRPPCALPKPAGPERSA* 
oooiwLft/ owijFAboUrF^Ar^PKKXjAAGAHTSASAKCPPAAAA 
GWQPRRPGFAGRAALPGPPHPPSS*RELGGLPGPGW*TLDPLPA 

HPAHPPGSAPPWGALGGWAAARASLPWSPSLCLSFPAVTPVAGL 
FPPGRG 


5700 


923 


597 


NGHKGVWEINIY*RRSNIHKNSKSESHLNQDHSFPPPTPNSARS 
KLHSTGTAKNTGLPLSGAPRQRAVFSGRTICQEFSSCLQCAYLD 
E*CSIASSLIKAILRVSVLSE 


5701 


59 


410 


IFEKICSDTQEFISPEINPQICSWL1FDK6AK/NHATGKDSLFN 
KWSWKNWLSTCR*MRPGPYFTPYTKINSK*IK/DANIRCETVKL 
LEENTGENLHDTGLGNVFLDMTPKTQPTKQK 
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Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion) 


5702


3


1517 



ETFVDPSQCGGIPSDSPHPVITPSRASESSASSDGPHPVITPSR

ASESSASSDGPHPVITPSRASESSASSDGLHPVITPSRASESSA 

SSDGPHPVITPSRASESSASSDGPHPVITPSRASESSASSDGLH 

PVITPSRASESSASSDGPHPVITPSWSPGSDVTLLAEALVTVTN 

IEVINCSITEIETTTSSIPGASDTDLIPTEGVKASSTSDPPALP 

D S TEAKPH I TE VTAS AE TLSTAG TT ESAAPHATVGTPLP TNSAT 

EREVTAPGATTLSGALVTVSRNPLEETSALSVETPSYVKVSGAA 

PVSIEAGSAVGKTTSFAGSSASSYSPSEAALKNFTPSETLTMDI 

TTKGPFPTSRDPLPSVPPTTTNSSRGTNSTLAKITTSAKTTMKP 

PTATPTTARTRPTT\A*VQVKMEVSSSCG*VWLPRKTSLTPEWQ 

KG * CS S STGNS TPTRLTS RS P YCVSG EANG / PS AAARHVP YAKR 

GCCP*PGPPPTDCSCVTVLRGTQKVPMKGSMSKPLTPDVATGPS 

LTSTGVYWGGASPVPRGVLGLTLAHVLCFSKEKT 


5703 


14 


1117 


HHKDSRSQGLPRTQECARPELRPLLCPRALWPVTRLSYRCPWQA 
PKAGIGTKAKPSESHLKLHPGWPSLDRQGEPATLGTGTGHCSDS 
RILRWHP* HTAAR* PRWRRLPSSHRWTRHIjGVLRVQDKS * * VSL 
DPSCRPRFLRTC**YGMRSVASSSNPPPGWSGPGASVFPARPVS 
ALPTGPRCW*APRGRTRQPCGWPRLSSPHATADWGPGCPLSPSR 
GSWETAPGS+WCPWL*AARWTGWRTASGASAGLGRAADRPSAWA 
RRVAGLLPGQGLTVRR*H*TAGAPASVRSSQGATRSPAPGGDQC 
ACGRGPGSC*HPPPWPVSPSSPVPCPSGR*HLRGPIiLSAARPRA 
AGWPRHSPHDTQTPEP 


5704 


23 


562 


GDYEFDSPYWDDISQAAKDLVTRLMEVEQDQRITAEEAISHEWI 
SGNAASDKNIKDGVCAQIEKNFARAKWKKAVRVTTLMKRLRAPE
QSSTAAAQSASATDTATPGAAGGATAAAASGATSAPEGDAARAA
KSDNVAPRRP+LPPQPQMEVPPQPLMAVSPQPPMEASLQPLMGE 
SPQP 


5705 


23 


562 


GDYEFDSPYWDDISQAAKDLVTRLMEVEQDQRITAEEAISHEWI
SGNAASDKNIKDGVCAQIEKNFARAKWKKAVRVTTLMKRLRAPE
QSSTAAAQSASATDTATPGAAGGATAAAASGATSAPEGDAARAA
KSDNVAPRRP*LPPQPQMEVPPQPLMAVSPQPPMEASLQPLMGE 
SPQP 


5706 


1161 


610 


QLGRFXAQDTVAIRKVKEVFGTGAMRHWILFTHKED*GGQALD 
DYVANTDNCS LKDLVRE CERRYCAFNNWGS VEEQRQQQAELLAV 
IERLGREREGSFHSNDLFLDAQLLQRTGAGACQEDYRQYQAKVE 
WQVEKHKQELRENESNWAYKALLRVKHLMLLH YE I FVFLLLCS I 
LFFIIFLF 


5707 


28 


609 


GSPAPTPGPRRRPGRGTPSPGTRHHQGRAEPEPDAPERAPLRR+ 
MFAIQPGLAEGGQFLGDPPPGLCQPELQPDSNSNFMASAKDANE 
NWHGMPGRVE P ILRRSS SES PSDNQAFQAPGSPEEGVRS PPEGA 
E I PGAE P EKMGG AGTVC S P LE DNG YAS SSLSIDSRSSSPE PACG 
TPRGPGPPDPLLPSVAQA 


5708 


44 


1925 


SFSWEETISPCFPKMPAEPWWLSPVSLGAAGWPGQPRPYLDLPA 
QASVSRPHDRA*GEAVSLSLSSGDVCGHTDGGGAGSDPQAKPKP 
PRCPFTAMPSPRTKQKVRNKVCLLIAIRYSDIPSDVSKAP\GPA 
GNPHDRSSTAA*LHRRAGAGSLCLSASIiLPPSFSLGAPGAPSPL 
RVSPASGGPRKEGRQGSGG*AGGGGP\ARTHADLPCVGFVCSPP 
LLK*SDSPVKQLPA\SGQGSGAGMPPVGSSDILRPRPTSVSGTG 
RAAG*CSWQPAACCTPRSQ*WAVARSPSRCSRW*RQSGR*RG*S 
SRRRRGP*AAGRSTPAVP*PCS*GGAGRRAYACRTGWGYAPSR* 
LEPSGPTSGSAL+TWASHSTGA**SRLCGTAGTGPLCSQSSRS* 
AG*RCCCTAASPCGGSGPSHPGSPSAHCLSWSGGRTQPRAPSAH 
GRGRAMGSRCVCTCTGLPCPGIPLSGASPGGSGRTGAGRSHTLK 
AARSRLSPRPGSGSRGSY*SHNDNWGTWPAPPSAGHLLVGG*NS 
QRTSSDH*YTGTRRPWAGPGTRCSTAPSRAAPPVSRCRPPPPPP 
PPRPPRLPAAAS/SGGASGSPAASCSCSCRAPAKPASS/GEAPA 
PPPRPEPPPPPARRP 


5709 


2 


2031 


ITLCPLPQTEKCLNWTEAATPLGIYLKARVEAGGLKELEISWG 
LHQI WRWGAWMRAGMGGCRCWGVMAP FAPR/NALS FLVNDCS 
L I HNNVCMAAVF VDRAGE WKLGGLD YMYSAQGNGGG PPRKGI PE 
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nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion) 








LEQYDPPELADSSGRWREKRSADMWRLGCLIWEVFNGPIjPRAA 
ALRNPGKIPKTLVPHYCELVGANPKVRPNPARFLQNCRAPGGFM 
SNRFVETNLFLEEIQIKEPAEKQKFFQELSKSLDAFPEDFCRHK 
VLPQLLTAFEFGNAGAWLTPLFKVGKFLSAEEYQQKI I PVWK 
MFSSTDRAMRIRLLQQMEQFIQYLDEPTVNTQIFPHWHGFLDT 
NPAIREQTVKSMLLLAPKLNEANLNVELMKHFARLQAKDEQGPI 
RCNTTVCLGKIGSYLSASTRHRVLTSAFSRATRDPFAPSRVAGV 
LGFAATHNLYSMNDCAQKILPVLCGLTVDPEKSVRDQAFKAIRS. 
FLSKLESVSEDPTQLEEVEKDVHAASSPGMGGAAASWAGWAVTG 
VSSLTSKLIRSHPTTAPTETNIPQRPTPEGVPAPAPTPVPATPT 
TSGHWETQEEDKDTAEDSSTADRWDDEDWGSLEQEAESVLAQQD 
DWSTGGQVSRASQVS\TPTTNPPNPQSPTGAAGK\RGLLGTGLA 
GAKLPGATS * RYTAGQRV 


5710 


1 


562 


IPGSTISCEVELMARMAKTIDSFTQNQTRLVVIIDGIjDACEQDK 
VLQMLDTVRVLFSKGPFIAIFASDPHIIIKAINQNLNSVPSGFK 
\LNGHD YMRN I VHLPVFLNSRGL/RQ/ LQENFS * LQQQMETFHA 
QILQGYRKMLTEEFHRTALGR*QNLVARQPSIDG*DAIGFELYV 
CIAIQFNTNKDDAT 


5711 


1526


1130 


RRHPFQWTTVTQEAFSHHDVAFTSTPVLFYPDSAQPFIVKSESS 
SQIAKAVLSQQRPSLFHECAFHFPS * SLQRHTINLDQGI F* LLM 
LSEERQHLFE S S / 1 WTT PHNLK* / FE I HEHLGSHEGHWTLF FLL 
QIL ! 


5712 


3 


1391 


GRKLFQSLDISERLKFLLTLDCVDDTLIVLAEEHGCLDIIKELP 
ETVIDLLNKCLTFHPSKRPTPDELMKDKVFSEVSPLYTPFTKPA 
SLFSSSLRCADLTLPEDISQLCKDINNDYLAERSIEEVYYLWCL 
AGGDLE KE LVNKE IIRSKPPI CTLPNFLFEDGES FGQGRDRSS / 
TFR*YHWDIWMPAKK*IERCWGRSILPITLKMTSLILPYSNSN 
NELSAAATLPIiIIREKDTEYQLNRULFDRLLKAYPYKKNQIWK 
EARVD I P PLMRGLTWAALLGVEGAI HAKYDAIDKDTP I PTDRQ I 
EVDIPRCHQYDELLSSPEGHAKFRRVLKAWWSHPDLVYWQGLD 
SLCAPFLYLNFNNEALVYACMSAFI PKYLYNFFLKDNSHVI QEY 
LTVFSQMIAFHDPEIiSNHLNEIGFIPDLYAIPWFLTMFTHVFPL 
HKIFHLW\DTLLLGEFLFPILYWE 


5713 


634


284 


PVCAVPVDRWPVLPREDQEGQQL*AKLPRDFRR*FQILGPMEGH 
TACRCSRRGAQVQIILPREDIRAAE*DPHLREVWPGLPTSSATSP 
+RAVLTSPCSHLGSADAASSHWLCGVSFH 


5714 


212 


613 


WGLGLGPTMSSLGGGSQDAGGSSSSSTNGSGGSGSSGPKAGAAD 
KS A WAAAAPAS VADDTP P P ERRNKSG 1 1 SEPLNKS LRRSRPLS 
HYSSFGSSGGSGGGSMMGGESADKATAAAAAASLLANGHDLAAA 
MA 


5715 


131 


1979 


ESASQQKRSKCLILTLKLELSGSAPKKTSARPGSSLWLPPHSQE

QTPPASKLQGGGGGLQTGWGLHPVPVTAASPLPRWCLFGAVAK\ 

GLPGP*LCPSGAA/GGLQRGPGLSPLGAAGKVSCLHPPSMVENN 

DSTCHEHHEGILAARVTPVP\SGKPGRVLKPPGRVCRPPHPAAS 

PRPPGS/SDLDGPRPQMHLRAFPAAHGGPVNTPHGGEEKTFMSS 

QIRRKETKPL*RKTPAG\NNYQSNSIPVSQSPQLTVDLLPSAGR 

TQAPSGRGDAGKPTPGHG\LPKASVILTPNCPCSLAGGQ*PPGL 

YPKTPKQRRWRRPL/LLGPSQ+GSRQSTC+EV\GALGEPVRIPG 

L*PDLSCILSNGSKHRREGLSFPRSLGPGRRGPAGLQSLGCSPT 

P KNTACHS S GHVALQAGHDS ARD VGSGHVALQAGHDS TQDVGR P 

v wkw x put* * LiGJjSKETGQATRRGLVWI S PGRAAAACVACAQALE 

EGPLRLPGQDRGAQPCSHCPGRAAGQPEPGAGAPCRE/GG*DPT 

GLT/GVPGTDPKRGGRKPGQSGQETQGPTVWSGPESPLQPKP*E 

RQE/VGAGASSGVGLSRGRAGGPSSAWEVAAMLLLLRHGSHSEL 

TDLTEAQTSQH 


5716 


1711 


1370 


RVFSLLCEGPGHCYQGAVCREACAAASPGLDSAAEPHRIjCEHTD 
*LPK*GPGYIQHFHCDSNILCILYNISFNLFSYSF*GVARYAC* 
RCPLVL* SGFFTI I VGGYSCCMPLKT 


5717 


44 


1489 


LPTEALRESEWVSEYGKCGPRGLVPEGESTSPLPSSVDTEDSLD 
EGPGALVLESDLLLGQDLEFEEEEEEEEGDGNSDQLMGFERDSE 
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Predicted end 
nucleotide 
location 
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to first 
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amino acid 
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Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion) 








GDSLGARPGLPYGLSDDESGGGRALSAESEVEEPARGPGEARGl"" 

RPGPACQLCGGPTGEGPCCGAGGPGGGPLLPPRLLYSCRLCTFV 

S H YS SHL KRHMQTHS G E KPFRCGR C P YAS AQL VNLTRHTRTHTG 

EKPYRCPHCPPACSSLGNLRRHQRTHAGPPTPPCPTCGFRCCTP 

RPARPPSPTEQEGAVPRRPEDALLLPDLSLHVPPGGASFLPDCG 

Q\CGVKGRASAGLDQNHCQS/SLFPWTCRGCGQELEEGEGSRLG 

AAMCGRCMRGEAGGGASGGPQGPSDKGFACSLCPFATHYPNHLA 

RHMKTHSGEKPFRCARCPYASAHLDNLKRHQRVHTGEKPYKCPL 

CPYACGNLANLKRHGRIHSGDKPFRCSLCNYSCNQSMNLIRHM 


5718 


120 


284 


VAHALSLPAESYGNDVSMTHPQLPPTQLAWDLCRTCLPLSYNFT 
S**STADPLHL 


5719 


48 


428 


E LNNG P FQM P LCNGGNLAVTG S WADRS PLHEAAS QG R LLALRTL 
LS QG YNVN AVTLDHVT P LHE A CLGDHVACARTLLE AGANVNAI T 
IDGVTPLFNACSQGSPSCAELLLEYGAKAQP\ESCLPSP 


5720 


1 


1051 


LQAFRNASEVPMVLVGTQDAISAA\NPRVYRRTSRARKLSTDLK 
\RCT\YYE\TCGGTYGLQMWSVSFQDVAQKWAIi\RKKQQ\LAI 
GPCK\SLPN\SPSH\SAVSAASIPARAPINQGHE/SGGGSAFSD 
Y\SSSVPSTPSISQRELRIETIAASSTPTP1RKQSKRRSNIFTS 
RKGADP\DREKKAAGCKVDSIGSGRAIPIKQGILLKRSGKSLNK 
EWKKKYVTLCDNGLLTYHPSIiHDYMQWIHGKE I DLLRTT VKVPG 
KRLPRATPATAPGTS PRANG LSVERSNTQLGGGTGAPHSASS AS 
LHSERPLSSSAWAGPRPEGLHQRSCSVSSADQWSEATTSLPPGM 
QHPASG 


5721 


97 


492 


RHSSPCCSLRRTERSSNAAVST/TTVQQFKRFIENYRRHIGCVA
VFYAIAGGLFLERAYYYAFAAHHTGITDTTRVGI I LSRGTAAS I 
SFMFSYILLTMCRNLITFLRETFLNRYVPFDAAVDFHRLIASTA 


5722 


88 


1043 


VALDVLAGSSPGGGMAGALLGPRVHGIRAVLRVARGGVQAPGAP 
GSLGVSHAAAPPARPQGAAQSPHRGRRHGGGGAGLPPPRSPRFP
QESVPASTSTARGPRRVSRRLPPQHPGPRGRRRRPGAGVGAPRR
GRARGQAGLLGRQGQGGRGAERERAALQARRGRRPGPEPDQSCG
GRPRRAAAAPGRAPADPQPPAPRPAPAPDVRPPADAPAPAPAPA
PPPPPHLGALTAGSGEERQSQPRAETLRLGRGAPLP\PRAERGG
RPKQAEQQQ\PKRPTPPARGPQSSGDPAMLPQRAGLRTGGLAGT
KSSTREIPEMI 


5723 


88 


1043 


VALDVLAGSSPGGGMAGALLGPRVHGIRAVLRVARGGVQAPGAP
GSLGVSHAAAPPARPQGAAQSPHRGRRHGGGGAGLPPPRSPRFP
QESVPASTSTARGPRRVSRRLPPQHPGPRGRRRRPGAGVGAPRR
GRARGQAGLLGRQGQGGRGAERERAALQARRGRRPGPEPDQSCG
GRPRRAAAAPGRAPADPQPPAPRPAPAPDVRPPADAPAPAPAPA
PPPPPHLGALTAGSGEERQSQPRAETLRLGRGAPLP\PRAERGG 
RPKQAEQQQ\PKRPTPPARGPQSSGDPAMLPQRAGLRTGGLAGT 
KSSTREIPEMI 


5724 


3 


1841 


FTNEAPPAPLPDASASPLSPHRRAKSLDRRSTEPSVTPDLLNFK 
KGWLTKQYEDGQWKKHWFALADQSLRYYRDSVAEEAADLDGEID 
LSACYDVTEYPVQRNYGFQIHTKEGEFTLSA^SGIRRNWIQTI 
MKHVHPTTAPDVTSSLPEEKNKSSCSFETCPRPTEKQEAELGEP 
DPEQKRSRARE\RRREGRSKTFDWAEFRPIQQALAQERVGGVGP 
ADTH\DPWRPEAEHGELERERARRREERRKRFGMLDATDGPGTE 
DAALRME VDRS PGLPMS DLKTHNVHVE I EQR WHQVETTPLREE K 
QVPIAPVHLSSEDGGDRLSTHELTSLLEKELEQSQKEASDLLEQ 
w K L»Liy uy JjK V ALGR E Q S AR EG YVLQATCERG FAAMEE THQKK I E 
DLQRQHQRELEKLREEKDRLLAEETAATISAIEAMKNAHREEME 
RELEKSQRSQISSVNSDVEALRRQYLEELQSVQRELEVLSEQYS 
QKCLENAHLAQALEAERQALRQCQRENQELNAHNQELNNRLAAE 
ITRLRTLLTGDGGGEATGSPLAQGKDAYELEVPSGARPCLTQLC 
TQEPQGSAAWPLSYRWGGTDLRQQESQGPGRSKSPEGGEEQ 


5725 


3 


1049 


VNGHSEETSQSPNRTEPHDSDCSVDLGISKSTEDLSPQKSGPVG 
SWKSHSITNMEIGGLKIYDILSDN\DLSSHLQPLK/FTSAVDG 
KNIVRSKAATLLYDQPLQVFTGSSSSSDLISGTKAIFKFDSNHN 
PE/GAKYNKRPHKWAHNLHLKYMVLHSIISNTVAV\RSQRHFVA 
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corresponding 
to first 
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amino acid 
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Predicted end 
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amino acid 
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Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








LQTKSPNRPCQFSSSAPS / VDQRAQ/ INQSYAKHSANMNFSNHN 
NVRANTAYHLHQRLGPARHGEMWAI S PNDRLI PAVTRSTI QRQS 
SVSSTASVNLGDPGSTRRAQIPEGDYLSYREFHSAGRTPPMMPG 
SQRPLSART YS I DGPNASRPQSARPS INE I PERTMSVSDFNYSR 
TSP 


5726 


2 


486 


SRSLSMWWNSGLPASSHSSKLPVTVGFSGCVIOtLRLHGRPLGAP ' 
TRMAGVTPCILGPLEAGLFFPGSGGVITL/ESVGAGIPGPSRAG 
QGSPGGSGEGPPLSSPSQPLPALLPGATLPDVGLELEVRPLAVT 
GL I FHLGQARTPP YLQLQVTEKQVLLRADDG 


5727 


21 


221 


RPILILKETRRLPWATGYAEVINAGKSTHNEDQASCEVLTVKKK 
AGAVTSTPNRNS S KRRS S L PNGE 


5728 


2 


877 


GTRNGQFE PRRGRAWEGSAGGLRAPGAAAGGPGVQPRGSG / LPG 
NAIRAGVNPGRGPASPFWDLSLPWDLWPPPTDHAPGAPDFPAVE 
GR\PWAGGRPPWPVSGVLGSRVCGPLYSTSPAGPG/SGGLSPSQ 
GGPAGAGGDAG/LPGRCPSAPWRAGSRPAASCPDWIPGPQGLWL 
HRN PTS/GPPSQI GEGAEQGDEGVADAPQ I QCKN/ G AED P P AED 
EPPQVPEAGEEDAVPABEGPGGTPETQADQVRERPEAHLAEGGA 
KGSPRRLADPQDLPAGQMSLAPPFPPVAAVIRSNK 


5729 


1 


1525 


AGGAREVLTLQLGHFAGFVGAHWWNQQDAALGRATDSKEPPGEL

CPDVLYRTGRTLHGQETYTPRLILMDLKGSIiSSLKEEGGLYRDK 

QLDAAIAWQGKLTTHKEELYPKNPYLQDFLSAEGVLSSDGVWRV 

KSIPNGKGSSPLPTATTPKPLIPTEASIRVWSDFLRVHLHPRSI 

CMIQKYNHDGEAGRLEAFGQGESVLKEPKYQEELEDRLHFYVEE 

CDYLQGFQILCDLHDGFSGVGAKAAELLQDEYSGRGIITWGLLP 

GPYHRGEAQRNIYRLLNTAFGLVHLTAHSSLVCPLSLGGSLGLR 

PEPPVSFPYLHYDATLPFHCSAILATALDTVTCS\YRLCSSPVS 

MVHL\ADMLSF-CGKKWTAGAIIPFPLAPGQSLPDSLMQFGGAT 

PWTPLSACGEPSGTRCFAQSWLRGIDRACHTSQLTPGTPPPSA 

LHACTTGEEILAQYLQQQQPGVMSSSHLLLTPCRVAPPYPHLFS 

SCS P PGMVLDGS P KGAAVE S VP VFG 


5730 


1258 


1713 


KKFQAPARETCVECQKTVYPMERLLANQQVFHISCFRCSYCNNK 
LSLGTYASLHGRIYCKPHFNQLFKSKGNYDEGFGHRPHKDLWAT 
KIETEGFWERPRNFENQGRPLKSPGGEDCPSC*GGCPGSNY*AQ 
GSSSREKGGQASWNPKLRVA 


5731 


122 


443 


RSHRGELIPKDSCYMRKPPRRPKKRRQG/CALPQGCLTFKDVAI 
EFSLEEWKCLNPAQRALYRAVMLENYRNLESVGLTSKDSWYMRK 
KPGRGRGKQRRQEWFFLRVY 


5732 


226 


772 


PPSRSCQSPRRKSRRRAHVTVTLVCGFTSFSFSLPLYLCGCLRF 
PER TCS QLQQAD WAPDFG P S S F VPS WGATATGARKFL I AFN I \N 
LLGTKEQAHRIALNLREQGRGKDQPGRLKKVQGIGWYLDEKNLA 
QVSTNLLDFEVTALHTVYEETCREAQELS LPWGS QLVGLVPLK 
ALLDAA 


5733 


1 


460 


PALQEWANAIAWGKQYENDARTLFEFTSGVNDTESPIIYRDES 
MRTACSPDGLCSDGNGLELKCPFTSRDFMKFRLGGFEAIKSAYM 
AQVQYSMWVTRKNAWYFANYDPRMKREGLHYWIERDEKYM\AS 
FDEI \VP\EFIGKMDEVLSRDPM 


5734 


3 


968 


RCNSPESLTSLLVLLTTANNLFVLIPAYSKNRAYAIFFIVFTVI
GSLFLMNLLTAIIYSQFRGYLMKSLQTSLFRRRLGTRAAFEVLS 
SMVGEGGAFPQAVGVKPQNLLQVLQKVQLDSSHKQAMMEKVRSY 
GSVLLSAEEFQKLFNELDRSWKEHPPRPEYQSPFLQSAQFLFG 
HYYFD Y LGNL I ALANLVS I C VFLVLDAD VL P AERDD F I LGI LNC 
VFI VY YLLSMLLKVFALGLRGYLS YPSNVFDGLLTWLLVLE I S 
TLiWCTDCHTQAGGRRWW/RLLSLWDMTRMLNMIiIVFRFLRIIP 
S M KPMA WAS TVLGL 


5735 


2 


540 


F FTPC VARAFNF P DQAT VKKAAYS LPR VGGGTS CGLPQARR I SL 
ATPRQLYK/SSNMTQRWQRREISNFEYLMFLNTIAGRTYNDLNQ 
YPVFPWVLTNYESEELDLTLPGNFRDLSKPIGALNPKRAVFYAE 
RYETWEDDQSPPYHYNTHYSTATSTLSWLVRI VS I FIELACLWY 
LXILT 


5736 


1 


382


GTRPSTKKSGYSPQQVAVIHCKGHQKENTAVAHSNQKADSAAQV 
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SEQ 
ID 
NO: 


predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end
nucleotide
location
corresponding
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








TARLSVTPPNLLPTVSFPQPDLPDNPVYSTTTEKLASDLRANKN 
QES * * I LPDSG IFIP*T*TS YLQS TTHLRRAKLPQLLRR 


5737 


290 


1041 


tsjw^utiLiLjtta e Lii bWh ijrNPljljPUbLxSVEARSQRANLGPCRRKR 
LQTLMRLAAGFQYSSHKDPSLSAKEKHTDYHNEARGPWPGWVG* 
RTADGSCGRGPDGAHHPGPKSSSWRASRLLPGLGGSHHLDAYVG 
RDLECGTP APLQLE I P PQPRGHPAP I PTGQAGPRDSG PGAS P * V 
ETRPLTDGRR*PGVRPVGWTPAHPAGTLRPRGAVEPSVSACGKW 
APS PTSQGCCEGRCDAVPKHRAWRTPLCSQ 


5738 


8 


460 


DTLS LNCTL PETLPMTPS F* LS FL * FPGLARAKS I PTKTYSNEV 
VTLWYRPPDILLGSTDYSTQIDMW*GQVEVWQGPCGKGGGLVTT 
ATQPAAFLFTVPSLPRGVGCIFYEMATGRPLFPGSTVEEQLHFI 
FRILSEEAWALCAVETHR 


5739 


1 


1222 


S FQRRGIRWNVHTLHPHPRAVWAGIGRGHGS * ALLGRARAPALC 
FPTLLEFLESLEPDLPALRAMGLHLWAAGPGTHPAGISDLLAEV 
SAEVDGPVPGYLSSPQSITDTCLYIFTSGTTGLPKAARISHLKI 
LQ C QG F YQLCG VHQ ED V I YLAL PL YHMSG S LLG I VG CMG I GAT V 
VLKSKFSAGQFWEDCQQHRVTVFQYIGELCRYLVNQPPSKAERG 
HKVRLAVGSGLRPDTWERFVRRFGPLQVLETYGLTEGNVATINY 
TGQRGAVGRASWLYKHIFPFSLIRYDVTTGEPIRDPQGHCMATS 
PGEPGLLVAPVSQQSPFLGYAGGPELAQGKLLKDVFRPGDVFFN 
TRDLLVCDDQGFLRFHDRTGD P FRWKGENVATTE VAE V FEALD F 
LQEVNVYGVTV 


5740 


265 


231 


PAYWLKVPTLCLESKTDLREKASHVSAQLQGEVRGLAGALWM*A
Y VYER VYN * N I S RMVHALEQ KRH P AG LS S S MALQLNP CLGM LMA 
LQSELHKLYDEETQSWVSGSACGGYP 


5741 


1 


650 


PRKTMRRGVLMTLLQQSAMTLPLWIGKPGDRPPPLCGAIPASGD 
YVARPGDKVAARVKAVDGDEQWILAEWSYSHATNKYEVDDIDE 
EGKERHTLSRRRVIPLPQWKANPETDPEALFQKEQLVLALYPQT 
TCFYRALIHAPPQRPQDDYSVLFEDTSYADGYSPPLNVAQRYW 
ACKEPKKK * CRLADSPS PNDTGQDS RGRAGI KH I PPLKKK 


5742


1 2 


362 


TQSVKE I L KRN PNVNLTD KDGNTALM I ASKEGHTE I VQDLLDAG 
TYVNIPDRSGDTVLIGAVRGGHVEIVRALLQKYADIDIRGQDNK 
TAL YWAVEKGNATMVRDI LQCNPDTE I CTKDG 


5743 


2 


415 


GKTPEGIDAIEEIEIDLEETEREISPQENGLEEVKPLGEMQTDL 
KATGRE I SPREKTPE VIDATEEIDKDLEETGRRE I SPEENGPEE 
VKPVDEMETDLKTTGREGSSREKTREVIDAABVIETDLEETERE 
ISPQE 


5744 


3 


703 


TRRTTTTSPTTTRQMTTTPAALPTTWTTPDLTTGTPLQMTTIA 
VFTTANTCLSLTPS TLPEEATGLLTPE PSKEGP I LTAES ETVL P 
SDSWSSAESTSADTVLLTSKESKVWDLPSTSHVSMWKTSDSVSS 
PQPGASDTAVPEQNKTTKTGQMDGI PMSMFCNEMP I SQLLMI IAP 
SLGFVLFALFVAFLLRGKLMETYCSQKHTRLDYIGDSKNVLNDV 
QHGREDEDGLFTL 


5745 


1400 


599 


GKSRFVNLMKHSKKTYDSFQDELEDYIKVQKARGLEPKTCFRKM

KGDYLETCGYKGEVNSRPTYRMFDQRLPSETIQTYPRSCNIPQT 

VENRLPQWLPAHDSRLRLDSLSYCQFTRDCFSEKPVPLNFNQQE 

YICGSHGVSHRVYKHFSSDNSTSTHQASHKQIHQKRKRHPEEGR 

EKSEEERSKHKRKKSCEEIDLDKHKSIQRKKTEVEIETVHVSTE 

KL KNR KE KKS RD WS KKEER KRTKKKKEQGQERTE EEMLWDQSI 

LGF 


5746 


3 


821 


S FASGRLTPSSPAFDGELDLORY^NnPAVQ AW<3^.^Mf:AWCWet ; c^ 
RAGERRFPCPVCGKRFRFNSILALHLRTHQPERPRSPAARLLLE 
LEERALLREARLGRARSSGGMQATPATEGLARPQAPSSSAFRCP 
YCKGKFRTSAERERHLHILHRPWKCGLCSFGSSQEEELLHHSLT 
AHGAPERPLAATSAAPPPQPQPQPPPQPEPRSVPQPEPEPQPER 
EATPTPAPAAPEEPPAPPEFRCQVCGQSFTQSWFLKGHMRKHKA 
SFDHACPV 


5747 


2 


1328 


DRHVETLCIHFLGPSTGSTAKTGGRNWLKTGNCLYGNTCRFVHG 
PSPRGKGYSSNYRRSPERPTGDLRERIKNKRQDVDTEPQKRKTE 
ESSSPVRKESSRGRHREKEDIKITKERTPESEEENVEWE'1*NRDD 
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SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion) 








S DNGD I NYD Y VHE LS LEM KRQ K I QRELMKLEQENMEKREEI I 1 K 
KEVSPEWRS KT..S PS PSLRKSSKS P KRKS S P KSS SAS KKDRKTS 
AVSSPLLDQQRMSKTNQSKKKGPRTPSPPPPIPEDIALGKKYKE 
K Y KVKD R I E E KTRDG KDRGRDFE RQREKRD KPRS TS PAGQ HHS P 
ISSRHHSSSSQSGSSIQRHSPSPRRKRTPSPSYQRTLTPPLRRS 
ASPYPSHSLSSPQRKQSPPRHRSPMREKGRHDHERTSQSHDRRH 
ERREDTRGKRDREKDSREEREYEQDQSSSRDHRDDREPRDGRDR 
RE 


5748 


934 


473 


SEGPQVFYKGLAPTLIAIFPYAGI^FSCYSSLKHLYKWAIPAEG 
KKNENLQNLLCGSGAGVISKTLTYPLDLFKKRLQVGGFEHARAA 
FGQ VRR YKGLM D CAKQ VLQ KEGALGF FKG L S P S LL KAAL S TG FM 
FFSYEFFCNVFHCMNRTASQR 


5749 


552 


1 


GFPVDPRVRGSTLSLAERPKGMIRSGSFRDPTDDVHGSVLSLAS
SASSTYSSAEERMQSEQIRKLRRELESSQEKVATLTSQLSANAN 
LVAAFEQSLVNMTSRLRHLAETAEEKDTELLDLRETIDFLKKKN 
S EAQAV I QGALNAS ETTP KELR I KRQNS SDS I S S LNS I TSHS S I 
GSSKDADA 


5750


22 


866 


IFISICLWNAHLCFLLLPKDCIDQVMKLQNLFVDDSGRYLAIQF 
HLE W A Y VFL Y Y YE Y RKAKDQLD I AKD I S QLQ I DLTGALG KRTR F 
QE NY VAQ L I LD VRREG D VLS NCE FT PAPTPQEHLT KNL E LNDDT 
ILNDIKLADCEQFQMPDLCAEEIAIILGICTNFQKNNPVHTLTE 
VELLAFTSCLLSQPKFWAIQTSALI LRTKLE KGSTRRVERAMRQ 
TQALADQFEDKTTSVLERLKIFYCCQVPPHWAIQRQLASLLFEL 
GCTSSALQIFEKLEMWE


5751

3 


751 


SCGSALRAWRCGAAALATFPAPALPGLMYRAIiYAFRSAEPNALA 
FAAGETFLVLERSSAHWWLAARARSGETGYVPPAYLRRLQGLEQ 
DVLQAIDRAIBAVHNTAMRDGGFCYSLEQRGVLQKLIHHRKETLS 
RRGPSASSVAVMTSSTSDHHLDAAAARQPNGVCRAGFERQHSLP 
SSEHLGADGGLFQIPLPSSQIPPQPRRAAPTTPPPPVKRRDREA 
LMASGSGGHNTMPSGGNSVSSGSSVSSCI 


5752 


3 


471 


GPVCGVGLSVAWAGPWRGPVHSVGGGGRAALHGAELPCLSGAAT
VEREMELRHKNEMLRVETEARARAKAERENADI I REQIRLKAS E 
HRQTVLES I RTAGTLFGEGFRAFVTDRDKVTATVNI FI KQGWQV 
AERQHVGAS WS PRS C PCRLCTAL 


5753 


34 


483 


DDSXAI PGGVQAP FGAVRN I YTPRTGHR I R KLDQ I QSGGNYVAG 
GQEAFKKXjNYLDIGEIKKRPMEWNTEVKPVIHSRINVSARFRK 
PLQEPCTIFLIANGDLINPASRLLIPRKTLNQWDHVLQMVTEKI 
TLRSGAVHRLYTLEGRLV 


5754 


14 


331 


TLVHWEFAGEHAEAIASREQEVLQGWKEIjLSACEDARLHVSST 
ADALRFHSQVRDLLS WMDG I ASQ IGAADKPRCPS S LLGLPAS PW 
WPTPATPSPLTAPFSME 


5755 


3 


888 


LGDQFYKEAIEHCRSYNSRLCAERSVRIjPFLiDSQTGVAQNNCYI "' 

WMEKRHRGPGLAPGQL YTYPARCWRKKRRLHPPEDPKLRLLE I K 

PEVELPLKKDGFTSESTTLEALLRGEGVEKKVDAREEESIQEIQ 

RVLENDENVEEGNEEE DLE ED I PKRKNRTRGRARGS AGGRRRHD 

AASQEDHDKPYVCDICGKRYKNRPGLSYHYAHTHLASEEGDBAQ 

DQETRSPPNHRNENHRPQKGPDGTVIPNNYCDFCLGGSNMNKKS 

GRPEELVSCA0CGRSAHLGGEGRKEKEAAA 


5756 


3 


621 


S S KLQALFAH PLYNVPEEP PLLGAEDSLLASQEALR YYRRKVAR 
WNRRHKM YREQMNLTS LDP PLQLRLEAS WVQFHLG I NRHG L Y S R 
oof V VoiUjJUyiJMKHt V 1 XoADYoyuEKAIiLGACDCTQIVKPSGV 
HLKLVLRFSDFGKAMFKPMRQQRDEETPVDFFYFIDFQRHNAEI 
AAFHLDRILDFRRVPPTVGRIVNVTKEIL 


5757 


3 


473 


YKDALLLPDNHRQWFENGTLKLTDVQKGMDEGEYLCSVLIQPQ 
LSISQSVHVAVKVPPLIQPFEFPPASIGQLLYIPCWSSGDMPI 
RITWRKDGQVIISGSGVTIESKEFMSSLQISSVSLKHNGNYTCI 
AS NAAAT VS R E RQL I VR V P PR PW 


5758 


1 


474 


FRRGAGAERGEHREGERGAAGMG E FKVHRVRFFN YVPSGI RCVA 
YNNQSNRLAVSRTDGTVE I YNLS AN YFQEKFFPGHESRATEALC 
WAEGQRLFSAGLNGE IME YDLQALNI KYAMDAFGGPI WSMAASP 
SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
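
The legend above can be read as a small lookup table. The following is a minimal sketch, not part of the original disclosure, of how a segment from this table might be tallied against that legend. The names `LEGEND`, `MARKERS`, and `summarize_segment` are hypothetical and chosen only for illustration. The sketch assumes that whitespace inside a segment carries no meaning, and it reports any character absent from the legend rather than guessing at it.

```python
# Symbols copied from the table legend; all names here are illustrative only.
LEGEND = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
    "X": "Unknown", "*": "Stop Codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}

# Non-residue symbols from the legend, mapped to the counter each one increments.
MARKERS = {
    "*": "stop_codons",
    "/": "possible_deletions",
    "\\": "possible_insertions",
}


def summarize_segment(raw: str) -> dict:
    """Count residues, legend markers, and unrecognized characters in one segment."""
    compact = "".join(raw.split())  # assumption: spaces and line breaks are not data
    summary = {
        "residues": 0,
        "stop_codons": 0,
        "possible_deletions": 0,
        "possible_insertions": 0,
        "unrecognized": [],
    }
    for ch in compact:
        if ch in MARKERS:
            summary[MARKERS[ch]] += 1
        elif ch in LEGEND:
            summary["residues"] += 1  # X (Unknown) is counted as a residue position
        else:
            summary["unrecognized"].append(ch)  # character not defined by the legend
    return summary


# Made-up input used only to exercise each legend symbol once.
example = summarize_segment("MK A/LX\\R*")
```

Listing unrecognized characters separately keeps the tally honest when a segment contains symbols the legend does not define.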








SGSQLLVGCEDGSVKLFQITPDKIPV 


5759 


2 


1240 


oiv^HjrtVjULi v v i ifiT^ HMiilJJjt'ij YTTNGTVHVVVNNQ IGFTTDPR 
MARSS P YPTDVARWNAPI FHVNADDPEAVI YVCS VAAEWRNTF 
NKDVGADLVCYRRRGHNEMDEPMFTQPLMYKQIHRQVPVLKKYA 
DKLIAEGTVTLQEFEEEIAKYDRICEEAYGRSKDKKILHIKHWL 
DSPWPGFFNVDGEPKSMTCPATGIPEDMLTHIGSVASSVPLEDF 
K I HTGLSR I LRGRADMTKNRTVD WALAE YMAFGSLLKEG I HVRL 
NGQDVERGTFSHRHHVLHDQEVDRRTCVPMNHLWPDQAPYTVCN 
SSLSEYGVLGFELGYAMASPNALVLWEAQFGDFHNTAQCIIDQF 
I S TGQAKWVRHNG I VLLL PHGMEGMG P EHS S ARP ER FLQM SNDD 
S DAYPAFTKD FE VSQL 


5760 


1 


1 991 


VRD I TSDSLS LSWTVPEGQFDHFLVQFKNGDGQPKAVRVPGHED 
GVTISGLEPDHKYKMNLYGFHGGQRVGPVSAVGLTAPGKDEEMA 
PASTEPPTPEPPIKPRLEELTVTDATPDSLSLSWTVPEGQFDHF 
LVQ YKNGDGQ P KATRVPGHEDRVT I SGLEPDNKYKMNL YG FHGG 
QRVGPVSAIGVTAAEEETPTPTEPSMEAPEPPEEPLLGELTVTG 
SSPDSLSLSWTVPQGRFDSFTVQYKDRDGRPQWRVGGEESEVT 
VGGLEPGRKYKMHLYGLHEGRRVGPVSTVGVTAPQEDVDETPSP 
TE PG TE AP E P P E E PLLG ELTVTGS S PD S LS LS WT VPQGR FDS FT 

VQYKDRDGRPQAVRVGGQESKVTVRGLEPGRKYKMHLYGLHEGR 
RLGPVSAIGVT 


5761


3 


1Z /b 


SCDMAEAAALVWIRGPGFGCKAVRCASGRCTVRDFIHRHCQDQN 
VPVENFFVKCNGALINTSDTVQHGAVYSLEPRLCGGKGGFGSML 
RALGAQ I EKTTNR EACRD LS G RRLRD VNHEKAMAE WVKQQAE RE 
AEKEQKRLERLQRKLVEPKHCFTSPDYQQQCHEMAERLEDSVLK 
GMQAAS S KMVSAE I SENRKRQWPTKS QTDRGASAGKRRCFWLGM 
EGLETAEGSNSESSDDDSEEAPSTSGMGFHAPKIGSNGVEMAAK 
FPSGSQRARWNTDHGSPEQLQIPVTDSGRHILEDSCAELGESK 
EHMESRMVTETEETQEKKAES KEP I EEEPTGAGLNKDKETEERT 
DGERVAEVAPEERENVAVAKLQESQPGNAVIEKETIDLLAFTSV 
AELELLGLEKLKCELMALGLKCGGTLQ 


5762 


2 


344 


GSTGQT PLHSQGGGGGSGGGRRRTPRGM PKEKYEP PD PRRM YT I 
mt» b il h> aaw L»KKS H WAE LE I S G KVRS LS AS LWSLTHLTALHLS DN 
SLSRIPSDIAKLHNLVYLDLSSNKIR 


5763 


3 


429 


LD KDTG L I ML IARLD YE L I QR FTLT 1 1 ARDG GGEETTGR VR I NV 
LD VNDNVPT FQ KDAY VG ALRENE PS VTQLVRLRATD E DS P PNNQ 
I T YS I VS AS AFGS Y FD I S L YEG YG V I S VS RPLD YEQ I SNGL I YL 
TVMAMDAGN 


5764 


19 


441 


VCARACGEMRQLLRPIDRQRYDENEDLSDVEEIVSVRGFSLBEK 
LRSQLYQGDFVHAMEGKDFNYE YVQREALRVPLI FREKDGLGI K 
MPDPDFTVRDVKLLVGSRRLVDVMDVNTQKGTEMSMSQFVRYYE 
TPEAQRDKL 


5765 


3 


825 


Q K I LRLNNS HQ P PT S S SNS KDCGG PASS G AGATAALADG L KFAS 
VQ AS APQGNS H KET S KS KVKR S KTS KDANKS L P S AAL YG I P E I S 
STGKRQEVQGRPGEATGMNSALGQSVSSGGSGNPNSNSTSTSTS 
AATAGAGSCGKSKEEKPGKSQSSRGAKRDKDAGKSRKDKHDLLQ 
GHQNGSG SQ APSGGHLYGFGAKSNGGGASPFHCGGTGSGSVAAA 

GEVSKSAPDSGLMGNSMLVKKEEEEEESHRRIKFGLiKTEKVDPLF 
TVPAPPPHV


5766


1606 


663 


SGLFSVDPASSQAMELSDVTLIEGVGNEVMWAGVWLILALVL 
AWLSTYVADSGSNOLLGAI VSAGDT^VI .HT.r;uvnuT A/Tirrir-vrnc 

PTELPHPSEGNDEKAEEAGEGRGDSTGEAGAGGGVEPSLEHLLD 
IQGLPKRQAGAGSSSPEAPLRSEDSTCLPPSPGLITVRLKFLND 
TEELAVARPEDTVGALKSKYFPGQESQMKLIYQGRLLQDPARTL 
RSLNITDNCVIHCHRSPPGSAVPGPSASLAPSATEPPSLGVNVG 
SLMVPVFWLLGWWYFRINYRQFFTAPATVSLVGVTVFFSFLV 
FGMYGR 


5767 


2 


892 


NFRATPRPPTRPELRTGTEVILWYLDWRALMKRKRMKANIKLVG 
SG FPLPSSDLDDSLTEE I DEKIGFRNDANFDWQNVADFRDAGGS 
LTEVKVEEEERDPQS PE FE I EEEEEMLS S VI PDSRRENEL PDFP 
hidefftlnstpsrsaVdephllvniekqklelekrrldieaer 
lqve kerlq i e kerlrhldmeherlqlekerlq iere klrlqi v 
nsekpslenelgqgeksmlqpqdieteklklererlqlekdrlq 
flkfeseklq i ekerlq vekdrlr i qkeghlq 


5768 


3 


476 


SSRSRLSVSVSPPPPGIVELGPPFAWEFCSRLGSAVTSQRAGPA 
AAMVAKDYPFYLTVKRANCSLELPPASGPAKDAEEPSNKRVKPL 
SRVTSLANL I PPVKATPLKRFSQTLQRS I S FRSESRPDI LAPRP 
WSRNAAPSSTKRRDSKLWSETFDVC 


5769 


38 


667 


TKTKKGVKEKATDQSVKAFTuUHCPELQYVGFMGCSVTSKGVIHL 
TKLRNDSSLDLRHITEIiDNETAMEIVKRCKNLISLNLCLNWIIN 
DRCVEVIAKEGQNLKELYLVSCKITDYAIilAIGRYSMTIETVDV 
GWCKE I TDQGATLIAQSS KSLRYLGLMRCDKVNEVTVEQLVQQY 
PHITFSTVLQDCKRTLERAYQMGWTPNMSAASS 


5770 


1 


484 


DSRRYDVKTRKWSFLLEEHSKLIAKVRCLPQVQLDPLPTTLTLA 
FASQLKKTSLSLTPDVPEADLSEVDPKLVSNLMPFQRAGVNFAI 
AKGGRLLLADDMGLGKTIQAICIAAFYRKEWPLLVWPSSVRFT 
WEQAFLRWLPSLSPDCINVWTGKDRLTA 


5771 


168 


741 


GLLPSACLRARSWREASEGPSSRACSNGSQDTFEACYSGTSTPS 
FHGSHCSGS DHS S U3LEQLQDYM VTLRSKLG PLE IQQ FAMLLRE 
Y RLGL P I QD Y CTGLL KL YG DRR KFLLLGM R P F I P DQD I G Y FEG F 

LEGVGIREGGILTDSFGRIKRSMSSTSASAVRSYDGAAQRPEAQ 
AFHRLLAD I TH DIE 


5772 


148 


383 


EFNLALVSPSHPQIKAEDDQPLPGVLLSLSGGLFRSNLLTQDNG 
ILTFSNLVTCSAIYHLPVFPEREPGCSMRDLRVA 


5773 


2 


723 


PRVRSKHNFCFMEMNTRLQVEHPVTEMITGTDLVEWQLRIAAGE 
KIPLSQEEITLQGHAFEARIYAEDPSNNFMPVAGPLVHLSTPRA 
DPSTRIETGVRQGDEVSVHYDPMIAKLWWAADRQAALTKLRYS 
LRQYWIVGLHTNIDFLLNLSGHPEFEAGNVHTDFIPQHHKQLLL 
SRKAAAKESLCQAALGLILKEKAMTDTFTLQAHDQFSPFSSSSG 
RRLN ISYTRNMTLKDGKNS K 


5774 


2 


592 


FVEEENIRWRCGGSELNFRRAVFSADSKYIFCVSGDFVKVYST 
VTE E CVH I LaHGHRN L VTG I Q LNPNNH LQL YS CSLDGTIKLWDYI 
DGILI KTFI VGCKLHALFTLAQAEDS VFVIVNKE KPDI FQLVS V 
KLPKSS S QE VEAKELS FVLD Y INQS P KCI AFGNEGVYVAAVRE F 
YLSVYFFKKETTS RVTLSS S 


5775 


3 


538 


SSGCCDPAAPSSLAEAATMPVSKCPKKSESLWKGWDRKAQRNGL 
RSQVYAVNGDYYVGE WKDNVKHGKGTQVWKKKGA I YEGDWKFG K 
RDGYGTLSLPDQQTGKCRRVYSGWWKGDKKSGYGIQFFGPKEYY 
EGDWCGSQRSGWGRMYYSNGDIYEGQWENDKPNGEGMLRLSQNP 
RP 


5776 


2 


484 


RLPQDCVCQNLSESLGTLCPSKGLLFVPPDIDRRTVELRLGGNF 
IIHISRQDFANMTGLVDLTLSRNTISHIQPFSFLDLESLRSLHL 
DSNRL P S LG E DTLRGL VNLQHL I VNNNQLGG I AD E AFEDFLLTL 
ED LDLS YNNLHG PAVG LRGDAW VQ PS TS 


5777


2 


949 


GQDPEPGQDLFQPEREVDPSWGRGREPRLGKLRFQNDHLSVLKQ 
VKKLEQALKDGSAGLDPQLPGTCYSPHCPPDKAEAGSTLPENLG 
GGSGSEVSQRVHPSDLEGREPTPBLVEDRKGSCRRPWDRSLENV 
YRGSEGSPTKPFINPLPKPRRTFKHAGEGDKDGKPGIGFRKEKR 
NLPPLPSLPPPPLPSSPPPSSVNRRLWTGRQKSSADHRKSYEFE 
DLLQSSSESSRVDWYAQTKLGLTRTLSEENVYEDILDPPMKENP 
YEDIELHGRCLGKKCVLNFPAS PTSSI PDTLTKQSLS KPAFFRQ 
NSERRNV 


5778 


1 


1210 


QRRQSVSRLLLPVFLLEPPAEPGLEPPPEEEGGEPAGVAEEPGS 
GGPCWLQLEEVPGPGPLGGGGPLRSPSSYSSDELSPGEPLTSPP 
WAPLGAPERPEHLLNRVLERLAGGATRDSAAS DILLDD I VLTHS 
LFLPTEKFLQELHQYFVRAGGMEGPEGLGRKQACLAMLLHFLDT 
YQGLLQEE EGAGH 1 1 KDL YLL I M KDESL YQGLREDTLRLHQL VE 
TVELKIPEENQPPSKQVKPLFRHFRRIDSCLQTRVAFRGSDEIF 
CRVYMPDHS YVT I RSRLSASVQD I LGS VTEKLQ YSE E PAGREDS 
LILVAVSSSGEKVLLQPTEDCVFTALGINSHLFACTRDSYEALV 
PLPEEIQVSPGDTEIHRVEPEDVANHLTAFHWELPRCVHELEFV
DYVFHGE 


5779 


138 


1671 


EAVQVLIKHSADVNARDKNWQTPLHVAAANKAVKCAEVIIPLLS 
SVNVSDRGGRTALHHAALNGHVEMVNLLLAKGANINAFDKKDRR 
ALHWAAYMGHLDWALLINHGAEVTCKDKKGYTPLHAAASNGQI 
NWKHLLNLGVE I DE I NV YGNTALHI AC YNGQDAWNE L I DYGA 
NVNQPNNNGFTPLHFAAASTHGALCLELLVNNGADVNIQSKDGK 
SPLHMTAVHGRFTRSQTLIQNGGEIDCVDKDGNTPLHVAARYGH 
ELLINTLITSGADTAKCGIHSMFPLHLAALNAHSDCCRKLLSSG 
Q KYS I VS L FS NEHVL S AG F E I DT P D KFGRTCLHAAAAGGNVEC I 
KLLQSSGADFHKKDKCGRTPLHYAAANCHFHCIETLVTTGANVN 
ETDDWGRTALHYAAASDMDRNKTILGNAHDNSEELERARELKEK 
EATLCLEFLLQNDANPS I RDKEGYNS IHYAAAYGHRQCLELLLE 
RTNSGFEESDSGATKSPLHLAVSEMP 


5780 


154 


624 


QFFRVITCLPFKGPDYRLYKSEPELTTVAEVDESNGEEKSEPVS 
E I ETS WKGSHFPVGWPPRAKS PTPESSTIAS YVTLRKTKKMM 
DLRTERPRSAVEQLCLAESTRPRMTVEEQMERIRRHQQACLREK 
KKQLNVIGASDQSPLQSPSNLRDNP 


5781 


19 


941 


RGSLGGHPWRPPMRAASQGCLPVSFVTGPHQERAYGGRGPGGAF 
PAPPVSGTCPPDLIYAPTPEKAEGGSQKNHQPPPGERAAHRDGE 
QAPCRAGPTRKVAVAPRPPSCP+GPE\PGEBPRRPLDRSPPLGQ 
VQPHFTSQDAKSAEDEAPSRHLGKHQPRSAQVGSRLDALQGPKT 
QHSIHTVTCKSPRQKEDRSPKPPQAPKHPEEHGRQS\QAPPPLP 
VAPSRTCGGC * T WDPALLVS P / PQGDSTPELPAP \QQPTGGPS R 

CRQALPPQG*RQQPRQRPR/PTGASRSHPAKAKGCQGPPKIRNY 
NIMD 


5782 


5176 


1237 


DRSMMSMAADS YTDS YTDT YTSAYMVP PLPPEE P PTMPPLP PEE 

PPMTPPLPPEEPPEGPALPTEQSALTAENTWPTEVPSLPSEESV 

SQPEPPVSQSEISEPSAVPTDYSVSASDPSVLVSEAAVTVPEPP 

PEPESSITLTPVESAWAEEHEWPERPVTCMVSETPAMSAEPT 

VLASEPPVMSETAETFDSMRASGHVASEVSTSLLVPAVTTPVLA 

ESILEPPAMAAPESSAMAVLESSAVTVLSSSTVTVLESSTVTVL 

EPSWTVPEPPWAEPDYVTIPVPWSALEPSVPVLEPAVSVLQ 

PSMIVSEPSVSVQESTVTVSEPAVTVSEQTQVIPTEVAIESTPM 

ILESSIMSSHVMKGINLSSGDQNLAPEIGMQEIALHSGEEPHAE 

EHLKGDFYESEHGINIDLNINNHLIAKEMEHNTVCAAGTSPVGE 

IGEEKILPTSETKQRTVLDTYPGVSEADAGETLSSTGPFALEPD 

ATG \ T S KG I E FTTAS TL SL VNK YD VDLS LTTQDTEHDM L I S TS P 

SGGSEADIEGPLPAKDIHLDLPSNINLVSSDTNEPI,PVKRD\DQ 

TLiAALI \SLKESSGGEKEVPPPS *REHLPDSGFSANIEDINEAD 

LVRPVSSPRTWNVLPSPRAGL\EGP\LLASDFGPVQNLYSSPW 

\SSMP\ERASGS\SSGEKGG\YEIFVKVKDTHEKSKKNKNRDKG 

EKEKKRDSSLRSRSKRSKSSEHKSRKLTSESRSRARKRSSKSKS 

H R S \ QTRS RS RS / RDRRRRS S RS RS KS RGRRS VS KE KRKRS P KH 

RSKSRERKRKRSSSRDNRKTVRARSRTPSRRSRSHTPSRRRRSR 

SVGRRRS FSISPSRRSRTPSRRSRTPSRRSRTPSRRSRTPSRRS 

RTPSRRSRTPSRRRRSRSWRRRSFSISPVRLRRSRTPLRRRFS 

RS P I RRKRS RS S ERGRS P KRLTD1*DKAQLLE I AKANAAAMCAKA 

GVPLPPNLKPAPPPTIEEKVAKKSGGATIEELTEKCKQIAQSKE 

DDDVIVNKPHVSDEEEEEPPFYHHPFFOjSEPKPIFFNLNIAAAK 

PTPPKSQVTLTKEFPVSSGSQHRKKEADSVYGEWVPVEKNGEEN 

KDDDNVFS SNLPS EP VD I STAMS ERALAQKRLS ENAFDLEAMSM 

LNRAQERIDAWAQLNS I PGQFTGSTGVQVLTQEQLANTGAQAWI 

KKDQFLRAAPVTGGMGAVLMRKMGWREGEGLGKNKEGNKEPILV 

DFKTDRKGLVAVGERAQKRSGNFSAAMKDLSGKHPVSALMEICN 

KRRWQPPEFLLVHDSGPDHRKHFLFRVLINGSAYQPNCMFFLNR 

Y 


5783 


1693 


698 


DSGLRVAFTMEGISNFKTPSKLSEKKKSVLCSTPTINIPASPFM 
QKLGFGTGVNVYLMKRSPRGLSHSPWAVKKINPICNDHYRSVYQ 
KRLMDEAKILKSLHHPNIVGYRAFTEANDGSLCLAMEYGGEKSL 
NDLIEE/PI*SQ/PKILFQQP/LILKVALNMARGLKYLHQEKKL 
LHGD I KS S NW IKGDFETIKIC D VG VS L PLDENMT VTD PE AC Y I 
GTEPWKP KEAVEENGVI TDKAD I FAFGLTLWEMMTLS I PHINLS 
NDDDD E DKT FDES D FDDE A Y Y AALGTRP P I NMEELDES YQKVI E 
LFSVCTNEDPKDRPSAAHIVEALETDV 


5784


2669 


1388 


PRVRPRVRTDHNYYISRIYGPSDSASRDLWVNIDQMEKDKVKIH 
G I LSNTHRQAARVNLS FDFP F YGHFLRE I TVATGGF I YTGEWH 
RMLTATQ Y I APLMANFD PS VSRNS TVR YFDNGTALWQWDHVHL 
QDN YNLGS FT FQATLLMDGR 1 1 FG YKE I PVL VTQ I S STNHP VKV 
GLSDAFVVVHRIQQIPNVRRRTIYEYHRVELQMSKITNISAVEM 
TPLPTCLQFNRCGPCVSSQIGFNCSWCSKLQRCSSGFDRHRQDW 
VDSGCPEESKEKMCENTEPVET\FLEPPQP+ERQPPSSGS*LPP 
E/DAVTSQFPTSLPTEDDTKIALHLKDNGASTDDSAAEKKGGTL 
HAGLIVGILILVLIVATAILVTVYMYHHPTSAASIFFIERRPSR 
WPAMKFRRGSGHPAYAEVEPVGEKEGFIVSEQC 


5785 


2669 


1388 


PRVRPRVRTDHNYYISRIYGPSDSASRDLWVNIDQMEKDKVKIH 
G I LSNTHRQAARVNLS FDFP F YGHFLRE I TVATGG F I YTGEWH 
RMLTATQ Y I APLMANFDPSVSRNSTVRYFDNGTALVVQWDHVHL 
QDNYNLGSFTFQATLLMDGRIIFGYKEIPVLVTQISSTNHPVKV 
GLSDAFVWHR I QQ I PNVRRRT I YE YHRVELQMSKI TNI SAVEM 
TPLPTCLQFNRCGPCVSSQIGFNCSWCSKLQRCSSGFDRHRQDW 
VDSGCPEESKEKMCBNTEPVET\FLEPPQP*ERQPPSSGS*LPP 
E/DAVTSQFPTSLPTEDDTKIALHLKDNGASTDDSAAEKKGGTL 
HAG L I VG I L I L VLI VATA I LVTV YM YHH PTS AAS IFFIERRPSR 
W P AMK FRRG S GH P A YAE VE P VGE KEG F I VS EQ C 


5786

2532 


1674 


S YKLPAAE RRAS S CS Q P PTPTRRRW PAPGRTS RGHR PQM * S GTP

APRPPARSTVSPASPLPKPRAGRCGSRPRSACSTFRPC*SLN*M 

S*H*KRNLSQRSSSMSRRPLSCARPHR**RQGLTVAARLPTWAK 

SPPLACSFCQAAQKSQSLSSGRSTR*PERMSFRP\SPPGNPAIP 

SLAPSSRP/PKGRPQCTWIPSRWPASPTAPPTTT*APTSSPGST 

GRSMMTCPTRWTATPWSARASSRPRNWPTP*WRPSGRLSTV*RA 

TGGS TATA P P KR F PRNWNPMMAE 


5787 


2 


1460 


MASAAS VTSLADEVNCP \ I CQGTLKEAGSLSNCG/HKNFCRACL 
T\RYCEIP\GPD\LEESP\TCP\LCKEPFRP\GSFRPNWQLANV 
VEN I ER LQLVSTLGLGE E DVCQEHG E K I Y F FCEDD EMQ LCWCR 
EAGEHATHTMRFLEDAA\APYREQIHKCLKCLIKEREEIQEIQS 
RENKRMQ VLLTQVS TKRQQVI SE FAHLRKFLEEQQS I LLAQLES 
QDGDILRQRDEFDLLVAGEICRFSALIEELEEKNERPARBLLTD 
I RS TL I R CETRKCR KP VAVS P ELGQR I R DFP QQ AL PLQREM KM F 
LEKLCFELDYEPAHISLDPQTSHPKLLLSEDHQRAQFSYKWQNS 
PDNPQRFDRATC VLAHTG I TGGRHTW WS I DLAHGGS CTVG WS 
EDVQR KG E LRLR P EEGVWAVRLAWG F VSALG S FP \ TRLT L KEQ P 
RQVRVSLDYEVGWVTFTNAVTREPIYTFTASFTRKVIPFFGLWG 
RGSSFSLSS 


5788 


2 


6860 


EHSVSGRSSAYGDATAEGHPAGPGSVSSSTGAISTTTGHQEGDG

SEGEGEGETEGDVHTSNRLHMVRLMLLERLLQTLPQLRNVGGVR 

AIPYMQVILMLTTDLDGEDEKDKGALDNLLSQLIAELGMDKKDV 

SKKNERSALNEVHLWMRLLSVFMSRTKSGSKSSICESSSLISS 

ATAAALLSSGAVDYCLHVLKSLLEYWKSQQNDEEPVATSQLLKP 

HTTSS PPDMS PFFLRQ YVKGHAADVFEAYTQLLTEMVLRLP YQI 

KKITDTNSRIPPPVFDHSWFYFLSEYLMIQQTPFVRRQVRKLLL 

F I CG S KE K YRQLRDLHTLDS \ H VRG I KKLLEEQG I FLRAS WTA 

S PQSALQ YDTLI S LMEHLKACAE I AAQRTINWQKFC I KDDSVLY 

FLLQVSFLVDEGVSPVLLQLLSCALCGSKVLRALAASSGSSSAS 

SSPAPVAASSGQATTQSKSSTKKSKKEEKEKEKDGETSGSQEDQ 

LCTALVNQLNKFADKETLIQFLRCFLLESNSSSVRWQAHCLTLH 

IYRNSSKSQQELLLDLMWSIWPELPAYGRKAAQFVDLLGYFSLK 

TPQTEKKLKEYSQKAVEILRTQNHILTNHPNSNIYNTLSGLVEF 

DGYYLESDPCLVCNNPEVPFCYIKLSSIKVDTRYTTTQQWKLI 

GSHTISKVTVKIGDLKRTKMVRTINLYYNNRTVQAIVELKNKPA 

RWHKAKKVQLTPGQTEVKIDLPLPIVASNLMIEFADFYENYQAS 

TETLQCPRCSASVPANPGVCGNCGENVYQCHKCRSINYDEKDPF
LCNACGFCKYARFDFMLYAKPCCAVDPIENEEDRKKAVSNINTL

liDKADRVYHQLMGHRPQLENLLCKVNEAAPEKPQDDSGTAGG I S 

STS ASVNRY X LQLAQE YCGDCKNS FDELS K 1 1 QKVFASRKELLE 

YDLQQREAATKSSRTSVQPTFTASQYRALSVLGCGHTSSTKCYG 

CASAVTEHCITLLRALATNPALRHILVSQGLIRELFDYNLRRGA 

AAMREEVRQLMCLLTRDNPEATQQMNDLIIGKVSTALKGHWANP 

DLASSLQYEMLLLTDS ISKED5 CWELRLRCALSLFLMAVNI KTP 

WVENI TLMCLRI LQKL I KP PAPTS KKNKDVPVEALTT VKPYCN 

E I HAQ AQLW LKRD P KAS YD AW KKCL P I RG I DGNGKAP S KS ELRH 

LYLTEKYVWRWKQFLSRRGKRTSPLDLKLGHNNWLRQVLFTPAT 

QAARQAACTIVEALATIPSRKQQVLDLLTSYLDELSIAGECAAE 

YLALYQKLITSAHWKVYIAARGVLPYVGNLITKEIARLLALEEA 

TLSTDLQQGYALKSLTGLLSSFVEVESIKRHFKSRLVGTVLNGY 

LCLRKLWQRTKLIDETQDMLLEMLEDMTTGTESETKAFMAVCI 

ETAKRYNLDDYRTPVFIFERLCSIIYPEENEVTEFFVTLEKDPQ 

QEDFLQGRMPGNPYSSNEPGIGPLMRDIKNKICQDCDLVALLED 

DSGMELLVNNKIISLDLPVAEVYKKVWCTTNEGEPMRIVYRMRG 

LLGDATEEFIESLDSTTDEEEDEEEVYKMAGVMAQCGGLECMLN 

RLAGIRDFKQGRHLLTVLLKLFSYCVKVKVWRQQLVKLEMNTLN 

VMLGTLNLALVAEQES KDSGGAAVAEQVLS I ME I \ I QAE PNVEP 

LSEDKGNLLLTGDKDQLVMLLDQINSTFVRSNPSVLQGLLRIIP 

YLSFGEVEKMQILVERFKPYCNFDKYDEDHSGDDKVFL\DCFCK 

IAAGIK\NNSNGHQL\KDL\ILQKGITQNALD\YMKKHIP/SAA 

RIWDADINWKSFCLRPALPFILRLLRGLAIQHPGTQVLIGTDSI 

PNLHKLEQVS\SDEGIGTLA\ENL\LESLREHPDVNKKIDA\AR 

RETRAEKKRMAM AMRQKALGTLG \ MTTNEKGQ WD / TRTALLEA 

DWEELIEEP\GLTCCICREGYKFQPTKVLGIYTFTKRWLGGVW 

ENKPRETSRATSTVSHFNIVHYDC\HLA\AVSLARGREEWESAA 

LQNANTKCNGLLPVWGPHVPESAFATCLARHNTYLQECTGQREP 

TYQLNIHDIKLLFLRFAMEQSFSADTGGGGRESNIHLIPYIIHT 

GLYVLNTTRATSREEJCNLQGFLEQPKEKWVESAFEVDGPYYFTV 

LALH I LP PEQ WRATRVE I LRRLLVTSQARAVAPGGATRLTD KAV 

KDYSAYRSSLLFWALVDLIYNMFKKVPTSNTEGGWSCSLAEYIR 

HNDMP I YEAADKALKTFQEEFMPVETFSEFLDVAGLLSE ITDPE 

SFLKDLLNSVP 


5789 


1 


2407 


LPLHAVEKTGRPGQPALKMPGKLRSDAGXiESDTAMKKGETLRKQ 
TEEKEKKEKPKSDKTEEIAEEEETVFPKAKQVKKKAEPSEVDMN 
SPKSKKAKK\KEEPSQNDISPKTKSLRKKKEPIEKKWSSKTKK 
VTKNEEPSEEEIDAPKPKKMKKEKEMNGETREKSPKliKNGFPHP 
EPDCNPSEAASBESNSEIEQEIPVEQKEG\AFSNFPISEETIKL 
LKGRGVTFLFPIQAKTFHHVYSGKDLIAQARTGTGKTFSFAIPL 
I E KLHG \ E LQDR KRGRAPQ VLVLAPTRE LANQVS KDFSDITKKL 
S VAC F YGGTP YGGQ FERMRNG I D I L VGT PGRI KDH I QNGKLDLT 
KLNHVVLDEVDQMLDMGFADQVEEILSVAYKKDSEDNPQTLLFS 
ATCPHW VFNVAKK YMKS TYE Q VD L I G KKTQKTA I TVEHLAI KCH 
WTQRAAVIGDVIRVYSGHQGRTI I FCETKKEAQELSQNSAIKQD 
AQSLKGD I PQKQRE I TLKGFRNGS FGVLVATNVAARGLD IPEVD 
L VI QS S P PKD VE S Y I HRSGRTGRAGRTG VC I CFYQH KEE YQLVQ 
VEQKAG I KFKRIGVPSATE 1 1 KASSKDAI RLLDS VP PTAI SHFK 
QSAEKLIEEKGAVEALAAALAHISGATSVDQRSLINSNVGFVTM 
ILQCS I E M PN I S YAW KE LKEQLG E E I DS KVKGMVF L KG KLGVCF 

DVPTASVTF T ORVUHn CD PUOT.CT/ZV r TI?rsDt?T PPnnp^vnn cm/->/% 
v i ifwv i u j.ycjrs.wnjJDKi\WVJjo v/\i Jrfcj_ic,(jFK.EG YGGFRGQ 

REGSRGFRGQRDGNRRFRGQREGSRGPRGQRSGGGNKSNRSQNK 
GQKRSFSKAFGQ 


5790 


3786 


1585 


ARRQRDP LQALRRRNQELKQQVDS LLSESQLKEALE PNKRQH I Y 
QRC I QLKQ A I D ENKN ALQ KLS KAD E S AP VAN YNQRKE EEHTLLD 
KLTQQLQGLAVTISRENITEVGAPTEEEEESESEDSEDSGGEEE 
DAEEEEE E KEENESHKWSTGEE Y I AVGDFTAQQVGDLTFKKGE I 
LLVIEKKPDGWWIAKDAKGNEGLVPRTYLEPYSEEEEGQESSEE 
GS EEDVEAVDETADGAEVK\QRTDPHWSAVQKAI SEAG I FCLVN 
HVS FCY L I VLMRNRME T VEDTNG S E TG FRAWNVQ SRGR I FL VS K 
fVLjUUiW 1 VLIVLl 1 MGA1 PAG KkPSTbSULLttttGNQ FRAN YFLQ 
PELMPSQLAFRDLMWDATEGTIRSRPSRI SLILTLWSCKMI PLP 
GMSIQVLSRHVRLCLFDGNKVLSNIHTVRATWQPKKPKTWTFSP 
QVTRILPCLLDGDCFIRSNSASPDLGILFELGISYIRNSTGERG 
ELSCGWVFLKLFDASGVPIPAKTYELFLNGGTPYEKGIEVDPSI 
SRRAHGSVFYQIMTMRRQPQLLVKLRSLNRRSRNVLSLLPETLI 
GNMCSIHLLIFYRQILGDVLLKDRMSLQSTDLISHPMLATFPML 
LEQPDVMDALRSSWAGQES\TLKRSEKR\PKEFLKVPRFLLVYH 
\GCVLPLL/HTPTRLPPFRWAEEETETARWKVITDFLKQNQENQ 
GALQALLSPDGVHEPFDLSEQTYDFLGEMRKNAV 


5791 


3 


1636 


LRVAE FAGT S R / IGAGLIQPLHRAPARDHGLLRGGAAPALSVSH

GN/GKQL/ AMSSQGSDDEQIKRENIRSLTMSGHVGFESLPDQLV 

NRSIQQGFCFNILCVGETGIGKSTLIDTLFNTNFEDYESSHFCP 

NVKLKAQTYELQESNVQIjKLTIVNTVGFGDQINKEESYQPIVDY 

IDAQFEAYLQEELKIKRSLFTYHDSRIHVCLYFISPTGHSLKTL 

DLLTMKNLDSKVYI I PVIAKADTVSKTELQKFKIKLMSELVSNG 

VQIYQFPTDDDTIAKVWAAMNGQLPFAWGSMDEVKVGNKMVKA 

RQYPWGWQVENENHCDFVKLREMLICTNMEDLREQTHTRHYEL 

YRRCKLEEMGFTDVGPENKPVSVQETYEAKRHEFHGERQRKEEE 

MKQMFVQRVKEKEAILKEAERELQAKFEHLKRLHQEBRMKLEEK 

RRLLEEEIIAFSKKKATSEIFHSQSFLATGSNLRKDKDRKNSQF 

FVKQKVPEHRRS SSQANF I KKKLEVCFDFAVI CFITS I FGEQPQ 

LLIFMEKYFQVQGQYISQSE 


5792 


2263 


653 


AAAAP S PAW W CG VF WYWHTC W VM YG I V YTRP CS GD AS C I QP Y 
LARRP KLQL \ RHS FT TTR S HLG AENN I DL VLNVE DFD VE S KFER 
TVNVS VPKKTRNNGTL YAYI FLHHAGVLPWHDGKQVIILVS PLTT 
YMVPKPEE I NLLTGE SDTQQIEADKKPTSALDE PVSHWRPRIAL 
NVMADNFV FDGS S LP AD VHR YMKM I QLGKT VH YL P I L F I DQLSN 
RVKDLMVI NRS T TEL PLT VS YD KVS LGRLR FW I HMQDAVYS LQQ 
FGFSEKDADEVKGIFVDTNLYFLALTFFVAAFHLLFDFLAFKND 
ISFWKKKKSMIGMSTKAVLWRCFSTWIFLFLLDEQTSLLVLVP 
AGVGAAI ELWKVKKALKMTI FWRGLMPEFQFGTYS ESERKTEEY 
DTQAMKYLSYLLYPLCVGGAVYSLLNIKYKSWYSWLINSFVNGV 
YAFGFLFMLPQLFVNYKLKSVAHLPWKAFTYKAFNTFIDDVFAF 
IITMPTSHRLACFRDDWFLVYLYQRWLYPVDKRRVNEFGESYE 
EKATRAPHTD 


5793 


2263 


653 


AAAAP S PAWWCGVFWYWHTCWVMYGIVYTRPCSGDASCIQPY
LARRP KLQL\RHSFTTTRSHLGAENNIDLVLNVEDFDVES KFER 
TVNVS VP KKTRNNGTLYAY I FLHHAGVLPWHDGKQVHLVSPLTT 
YMVPKPEEINLLTGESDTQQIEADKKPTSALDEPVSHWRPRLAL 
NVMADNFVFDGS SLPADVHR YMKMIQLGKTVHYLP ILFIDQLSN 
RVKDLMVINRSTTELPLTVSYDKVSLGRLRFWIHMQDAVYSLQQ 
FGFSEKDADEVKG I FVDTNLYFLALTFFVAAFHLLFDFLAFKND 
I SFWKKKKSMIGMSTKAVLWRCFSTWI FLFLLDEQTSLLVLVP 
AGVGAAI ELWKVKKALKMTI FWRGLMPEFQFGTYS ESERKTEEY 
DTQAMK YLS YLL YPLCVGGAVYS LLNI K YKS W YS WL INSFVNG V 
YAFG FLFM L PQLF VNYKLKS VAHLPWKAFT YKAFNTFI DDVFAF 
IITMPTSHRLACFRDDWFLVYLYQRWLYPVDKRRVNEFGESYE 
EKATRAPHTD 


5794 


1 


5016


MGPRLSVWLLLLPAALLLHEEHSRAAAKGGCAGSGCGKCDCHGV

KGQKG E RGL PGLQGV I G FPGMQG P EG PQG P PGQKGDTGE PGLPG 

i iua i K<a v f\yi\^{j x PGNPGLPG I PGQDG P PGP PG I PG CNGTKGER 

GPLGPPGLPGFAGNPGPPGLPGMKGDPGEILGHVPGMLLKGERG 

FPGIPGTPGPPGLPGLQGPVGPPGFTGPPGPPGPPGPPGEKGQM 

GLSFQGPKGDKGDQGVSGPPGVPGQAQVQEKGDFATKGEKGQKG 

EPGFQGMPGVGEKGEPGKPGPRGKPGKDGDKGEKGSPGFPGEPG 

YPGLIGRQGP\QGEKGEAGPPGPPGIVIGTGPLGEKGERGYPGT 

PGPRGEPGPKGFPGLPGQPGPPGLPVPGQAGAPGFPGERGEKGD 

RGFPGTSLPGPSGRDGLPGPPGSPGPPGQPGYTNGIVECQPGPP 

GDQGPPGIPGQPGFIGEIGEKGQKGESCLICDIDGYRGPPGPQG 

PPGEIGFPGQPGAKGDRGLPGRDGVAGVPGPQGTPGLIGQPGAK 






WO 01/53312 



PCT/US00/34263 



GEPGEFYFDLRLKGDKGDPGFPGQPGMPGRAGSPGRDGHPGLPG

PKGSPGSVGLKGERGPPGGVGFPGSRGDTGPPGPPGYGPAGPIG 

DKGQAGFPGGPGSPGLPGPKGEPGKIVPLPGPPGAEGLPGSPGF 

PG PQGDRG F PGT P GR \ PG L \ PGE KGAVG \ QPGIGFPGPPGP KG V 

DGLPGDMGPPGTPGRPGFNGLPGNPGVQGQKGEPGVGLPGLKGL 

PGLPGIPGTPGEKGSIGVPGVPGEHGAIGPPGLQGIRGEPGPPG 

LPGSVGSPGVPGIGPPGARGPPGGQGPPGLSGPPGIKGEKGFPG 

FPGLDMPGPKGDKGAQGLPGITGQSGLPGLPGQQGAPGIPGFPG 

SKGEMGVMGTPGQPGSPGPWGAPGLPGEKGD\HGFPGSSGPRGD 

PGLKGDKGDVGLPGKPGSMDKVYMGSMKGQKGDQGEKGQIGPIG 

EKGSRGDPGTPGVPGKDGQAGQPGQPGPKGDPGISGTPGAPGLP 

GPKGSVGGMGLPGTPGEKGVPGIPGPQGSPGLPGDKGAKGEKGQ 

AGPPGIGIPGLRGEKGDQGIAGFPGSPGEKGEKGSIGIPGMPGS 

PGLKGSPGSVGYPGSPGLPGEKGDKGLPGLDGIPGVKGEAGLPG 

TPGPTGPAGQKGEPGSDGIPGSAGEKGEPGLPGRGFPGFPGAKG 

DKGSKGEVGFPGLAGSPGIPGSKGEQGFMGPPGPQGQPGLPGSP 

GHATEGPKGDRGPQGQPGLPGLPGPMGPPGLPGIDGVKGDKGNP 

GWPGAPG VPG P KGD PG FQGM PG IGGS PG I TGS KGDMGPPG VPGF 

QG PKG L PGLQG I KGDQGDQ G VPGA KG L PGP PGP PGP YD 1 1 KG EP 

GLPGPEGPPGLKGLQGLPGPKGQQGVTGLVGIPGPPGIPGFDGA 

PGQKGEMGPAGPTGPRGFPGPPGPDGLPGSMGPPGTPSVDHGFL 

VTRHSQTIDDPQCPSGTKILYHGYSLLYVQGNERAHGQDLGTAG 

SCLRKFSTMPFLFCNINNVCNFASRNDYSYWLSTPEPMPMSMAP 

ITGENIRPFISRCAVCEAPAMVMAVHSQTIQIPPCPSGWSSLWI 

G YS FVMHTS AGAEGSGQALAS PGSCLEEFRS AP FI ECHGRGTCN 

YYANAYS FWLATIERSEMFKKPTPSTLKAGELRTHVSRCQVCMR 

RT 


5795


1192 


61 


STRSPTVEYISAHPHILFMLLKGYEAPQIALRCGIMLRECIRHE
P LAKI I L FSNQFRD FFKYVELS TFD IAS DAFATFKDLLTRHKVL 
VADFLEQNYDTIFEDYEKLLQSENYVTKRQSLKLLGELILDRHN 
FA I MTKY I S KPENLKLMMNLLRDKS PN I QFEAFHVFKVFVAS PH 
KTQPIVEILLKNQPKLIEFLSSFQKERTDDEQFADEKNYLIKQI 
RDLKKTAP * RALRDS KR 


5796 


2 


1078 


GRVGWELWCMYISPPKDWWDAGDPSLPIRTPAMIGCSFWNRKF
FGEIGLLDPGMDVYGGENIELGIKVWLCGGSMEVLPCSRVAHIE 
RKKKPYNSNIGFYTKRNALRVAEVWMDDYKSHVYIAWNLPLENP 
GIDIGDVSERRALRKSLKCKNFQWYLDHVYPEMRRYNNTVAYGE 
LRNNKAK DVCLDQG P LENHTAI L Y P CHGWG PQ LAR YTKEG FIjHL 
G ALGTTTLL P DTRCLVDNS KS RL PQLLD CDKVKS S L YKRWN F I Q 
NGA I MNKGTGRCLE VENRGLAG I DL I LRS CTGQRWT I KNS I K* R 

EGAGALEPGPQDMAAPPNIWTSCPGGETARGRQVLDGPPRASPG 
QHRDPG 


5797


2 


891 


PRVRQKTLVDVTLENSNIKDQIRNLQQTYEASMDKLREKQRQLE

VAQVENQLLKMKVESSQEANAEVMREMTKKLYSQYEEKLQEEQR 

KHSAE KE ALLEETNS FLKAI EE ANKKMQAAE I SLEEKDQR I GEL 

DRL I ERME KERHQLQLQLLEHETEMS G ELTDS DKE R YQQLE E AS 

ASLRERIRHLNDMVHCQQKKVKQMVEEIESLKKKLQQKQLLILQ 

LLEKISFLEGENNELQSRLDYLTETQAKTEVETREIGVGCDLLP 

SQTGRTREIVMPSRNYTPYTRVLELTMKKTLT 


5798 


644 


115 


KILGSRWKSMSNQEKQPYYEEQARLSKIHLEKYPNYKYKPRPKR
TC I VDG K KLR I GE Y KQ LMR S RRQEMRQ FFT VGQQ PQ I P I TTGTG 
WYPGAITMATTTPSPQMTSDCSSTSASPEPSLPVIQSTYGMKT 
DGGS LAGNE M I NGEDEME M YDD YEDD P KSD YS S ENE APE AVS AN 


5799 


2679 


1435 


LLSTYIKFINLFPETKATIQGVLRAGSQLRNADVELQQRAVEYL
TLS S VAS TD VLATVLEEM P P FPERES S I LAKLKRKKGPGAGSAL 
DDGRRDPSSND INGGME PTPSTVSTPS PS ADLLGLRAAP P P AAP 
PAS AGAGNLL VD VFDG P AAQ PS LG PT P EE AFLS PG P EDIG P P I P 
EADELLNKFVCKNNGVL FENQLLQ I G VKSEFRQNLGRMYLFYGN 
KTSVQFQNFSPTWHPGDLQTQLAVQTKRVAAQVDGGAQVQQVL 
NIECLRDFLTPPLLSVRFRYGGAPQALTLKLPVTINKFFQPTEM 
AAQDFFQRWKQLSLPQQEAQKIFKANHPMDAEVTKAKLLGFGSA 
LLDNVD PNPENFVGAG 1 1 QTKALQVGCLLRLE PNAQAQM YRLTL
RTSKEPVSRHLCELLAQQF 


5800 


2679 


1435 


LLSTYIKFINLFPETKATIQGVLRAGSQLRNADVELQQRAVEYL 
TLSSVASTDVLATVLEEMPPFPERESSILAKLKRKKGPGAGSAL 
DDGRRDPSSNDINGGMEPTPSTVSTPSPSADLLGLRAAPPPAAP 
PASAGAGNLLVDVFDGPAAQPSLGPTPBEAFLSPGPEDIGPPIP 
EADELLNKFVCKNNGVLFENQLLQIGVKSEFRQNLGRMYLFYGN 
KTSVQFQNFSPTWHPGDLQTQLAVQTKRVAAQVDGGAQVQQVL 
NIECLRDFLTPPLLSVRFRYGGAPQALTLKLPVTINKFFQPTEM 
AAQDFFQRWKQLSLPQQEAQKIFKANHPMDAEVTKAKLLGFGSA 
LliDNVDPNPENFVGAG 1 1 QTKALQVGCLLRLE PNAQAQMYRLTL 
RTSKEPVSRHLCELLAQQF 


5801 


3 


1413 


FPRLYHLIPDGEITSlKINRVDPSESLSIRLVGGSETPLVHIII 
QHIYRDGVIARDGRLLPGDIILKVNGMDISNVPHNYAVRLLRQP 
CQVLWLTVMREQKFRSRNNGQAPDAYRPRDDSFHVILNKSSPEE 
QLGI KLVRKVDEPGVFI FNVLDGGVAYRHGQLEENDRVLAINGH 
DLRYGSPESAAHLIQASERRVHLWSRQVRQRSPDIFQEAGWNS 
NGSWSPGPGERSNTPKPLHPTITCHEKWNIQKDPGESLGMTVA 
GGASHRE WDLP I YVI S VE PGG V I SRDGR I KTGD I LLNVDGVELT 
BVSRSEAVALLKRTSSSIVLKALEVKEYEPQEDCSSPAALDSNH 
NMAPPSDWSPSWVMWLELPRCLYNCKDIVLRRNTAGSLGFCIVG 
GYEEYNGNKPFFIKSIVEGTPAYNDGRIRCGDrLLAVNGRSTSG 
MIHACLARLLKELKGRITLTIVSWPGTFL 


5802 


3 


290 


CFSLYQIMERIMDLPTLLRHAFREMFSVGGLFWMFRIRIILCLM 
GAFFYLISPLDFVPEALFGILGFLDDFFVIFLLLIYISIMYREV 
ITQRLTR 


5803 


2234 


1299 


EAQFGTTAEI YAYREEQDFGIEIVKVKAIGRQRFKVLELRTQSD
G1QQAKVQILPECVLPSTMSAVQLESLNKCQIFPSKPVSREDQC 
SYKWWQKYQKRKFHCANLTSWPRWLYSLYDAETLMDRIKKQLRE 
W D ENLKDDS LPS NP I D FS YRVAACL P I DD VLR I Q LLKI G S AI QR 
LRCELDIMNKCTSLCCKQCQETEITTKNEIFSLSLCGPMAAYVN 
PHGYVHETLTVYKACNLNLIGRPSTEHSWFPGYAWTVAQCKICA 
SHIGWKFTATKKDMSPQKFWGLTRSALLPTIPDTEDEISPDECVI 
LCL


5804 


2 


1707 


EMEKQRQEEQRKRTEEERKRRIEQDMLEKRKIQRELAKRAEQIE 
DINNTGTESASEEGDDSLLITWPVKSYKTSGKMKKNFEDLEKE 
REEKERIKYEEDKRIRYEEQRPSLKEAKCLSLVMDDEIESEAKK 
ESLSPGKLKLTFEELERQRQENRKKQAEEEARKRLEEEKRAFEE 
ARRQMVNEDEENQDTAKIFKGYRPGKLKLSFEEMERQRREDEKR 
KAEEEARRRIEEEKKAFAEARRNMWDDDSPEMYKTISQEFLTP 
GKLEINFEELLKQKMEEEKRRTEEERKHKLEMEKQEFEQLRQEM 
GEEEEENETFGLSREYEELIKLKRSGSIQAKNLKSKFEKIGQLS 
EKEIQKKIEEERARRRAIDLEIKEREAENFHEEDDVDVRPARKS 
EAPFTHKVNMKARFEQMAKAREEEEQRRIEEQKLLRMQFEQREI 
DAALQKKREEEEEEEGSIMNGSTAEDEEQTRSGAPWFKKPLKNT 
SWDSEPVRFTVKVTGEPKPEITWWFEGEILQDGEDYQYIERGE 
TYCLYLPETFPEDGGEYMCKAVNNKGSAASTCILTIESKN 


5805 


3 


776 


Y I S DTLGQVY KS K I R W W I E ENGGNGN I S VDDL I ALLDLAEHAS S 
AFKESQQQSEDREYEVKERLYPKSKRRYDTYNIAGYQGEIEVGL 
YTIQILQLIPFFDNKNELSKRYMVNFVSGSSDIPGDPNNEYKLA 
LKN Y I P YLT KLKFS L KKS FD FFD E Y FVLLKP RNN I KQNEE AKTR 
RKVAG YFKKYVD I FCLLEESQNNTGLGSKFSEPLQVERCRRNLV 
ALKADKFSGLLEYLIKSQEDAISTMKCIVNEYTFLLK 


5806 


1257 


877 


AVFTFHNHGRTANLYSLHSWLGITTVFLFACQRFLGFAVFLLPW 
ASMWLRSLLKPIHVFFGAAILSLS IASVISGINEKLFFSLKNTT 
RP YHS LPS EAVFANSTG M L WAFG LLVL Y ILLAS S WKR P 


5807 


2267 


1302 


RFS KKTFRRPMAVDIQPACLGLYCGKTLLFKNGSTE I YGECGVC 
PRGQRTNAQ K Y CQ P CTE S P E L YDWL YLG FMAML P LVLHW F F I E W 
YSG KKS S SAL FQH I TAL FE CSMAA I 1 TLLVSD P VG VL Y IRS CR V 
LMLSDWYTMLYNPSPDYVTTVHCTHEAVYPLYTJVFIYYAFCLV 
LMMLLR P LLVKK I ACG LGKS D R PKS I YAALY F FP I LTV LQAVGG
GLL YY A F P Y 1 1 LVLS L VTLAVYM S AS E I ENC YDLL VR K KRL I VL 

FSHWLLHAYG I I S I S RVDKLEQDLPLLALVPTPALF YLFTAKFT 
EPSRILSEGANGH 


5808 


2 


433 


S L P DSG WE Y LSNGG VADNH KD FGE LR YNE CLMNFS CNG KNG S S 
EGRITHGFQLKSAYENNLMPYTNYTFDFKGVIDYIFYSKTHMNV 
LG VLG P L D P Q WLVENN I TGC PH PH I PS DHFS LLTQL ELH P PLLP 
LVNGVHLPNRR 


5809 


464 


2422 


ILVPGFQGILHPGVYCALQSQHQAQELVADIDECEVSGLCRHGG 
R CVNTHGS F E CYCMDG YLPRNG P E P FH P TTDATS CTE I DCGTP P 
EVPDGYIIGNYTSSLGSQVRYACREGFFSVPEDTVSSCTGLGTW 
ESPKLHCQEINCGNPPEMRHAILVGNHSSRLGGVARYVCQEGFE 
S PGG K I TS VCTEKG T W RES TLT CTE ILTKINDVSLFNDTCVRWQ 
INSRRINPKISYVISIKGQRLDPMESVREETVNLTTDSRTPEVC 
LALYPGTNYTVNISTAPPRRSMPAVIGFQTAEVDLLEDDGSFNI 
S I FNETCLKLNRRSRKVGSEHMYQFTVLGQRWYLANFSHATSFN 
FTTR EQVP WCLDL Y PTTD YTVNVTLLRS PKRHSVQ I T I ATP PA 
VKQTISNISGFNETCLRWRSIKTADMEEMYLFHIWGQRWYQKEF 
AQEMTFNISSSSRDPEVCLDLRPGTNYNVSLRALSSELPW1SL 
TTQ ITEPPLPEVEF FTVHRGPL PRLRLR KAKE KNG PISS YQVLV 
LPLALQSTFSCDSEGASSFFSNASDADGYVAAELLAKDVPDDAM 
EIPIGDRLYYGEYYNAPLKRGSDYCIILRITSEWNKVRRHSCAV 
WAQVKDSSLMLLQMAGVGLGSLAWIILTFLSFSAV 


5810 


3 


1641 


KVFGTHKDHEVSTLDTAI SAVKVQLAEFLENLQEKSLR I EAFVS

E I E S FFNT I EENCS KNE KRLEEQNEEMMKKVLAQYDE KAQSFEE 

VKKKKMEFLHEQMVHFLQSMDTAKDTLETIVREAEELDEAVFLT 

SFEEINERLLSAMESTASLEKMPAAFSLFEHYDDSSARSDQMLK 

QVAVPQPPRLEPQEPNSATSTTIAVYWSMNKEDVIDSFQVYCME 

EPQDDQEVNELVEEYRLTVKESYCIFEDLEPDRCYQVWVMAVNF 

TGCSLPSERAIFRTAPSTPVIRAEDCTVCWNTATIRWRPTTPEA 

TETYTLEYCRQHSPEGEGLRSFSGIKGLQLKVNLQPNDNYFFYV 

RAINAFGTSEQSEAALISTRGTRFliLLRETAHPALIIISSSGTVI 

SFGERRRLTEIPSVLGEELPSCGQHYWETTVTDCPAYRLGICSS 

SAVQAGALGQGETSWYMHCSEPQRYTFFYSGIVSDVHVTERPAR 

VGILLDYNNQRLIFINAESEQLLFIIRHRFNEGVHPAFALEKPG 

KCTLHLG I E P P D S VRH K 


5811 


1918 


851 


AAALADPLPEDKWSAEKRRPLKSSLGYElTFSLLNPDPKSHDVY 
WD I EGAVRR YVQ P FLNALGAAGN FS VDS Q I L YYAM LGVN PRFDS 
ASSSYYLDMHSLPHVINPVESRIiGSSAASLYPVLNFLLYVPEIiA 
HSPLYIQDKDGAPVATNAFHSPRWGGIMVYNVDSKTYNASVLPV 
RVEVDMVRVMEVFLAQLRLLFGIAQPQLPPKCLLSGPTSEGLMT 
WELDRLLWARSVENLATATTTLTSLAQLLGKISNIVIKDDVASE 
VYKAVAAVQKSAEELASGHLASAFVASQEAVTSSELAFFDPSLL 
HLLY FPDDQKFAI YI PLFLPMAVP I LLSLVKI FLETRKS WRKPE 
KTD 


5812 


5204 


2744 


GGRQRCQRGRSCGAREEEVEPGTARPPPAASAMDASLEKIADPT 
LAEMGKNLKEAVKMLEDSQRRTEEENGKKLISGDIPGPLQGSGQ 
DMVS I LQLVQNLMHGDE DE E P QS PR I QN I GEQGHMALLGHS LGA 
YISTLDKEKLRKLTTRILSDTTLWLCRIFRYENGCAYFHEEERE 
GLAKI CRLAIHS R YEDF WDGFNVL YNKKPV I YLS AAARPGLGQ 
YLCNQLGLPFPCLCRVPCNTVFGSQHQMDVAFLEKLIKDDIERG 
RL PLLLVANAGTAAVGHTDKIGRLKELCEQYG I WLHVEGVNLAT 
LALGYVSSSVLAAAKCDSMTMTPGPWLGLPAVPAVTLYKHDDPA 
LT L VAG LTSNKP TDKLRALP LWLS LQ YLGLDG F VE R I KHACQLS 
Q RLQE S LKKVN Y I K I L VEDE LS S PVWFR FFQEL PGS D PVFKAV 
PVPNMTPSGVGRERHSCDALNRWLGEQLKQLVPASGLTVMDLEA 
EGTCLRFS PLMTAAVLGTRGEDVDQLVAC I ESKLP VLC CTLQLR 
EEFKQEVEATAGLLYVDDPNWSGIGWRYEHANDDKSSLKSYPQ 
GENIHAGLLKKLNELESDLTFKIGPEYKSMKSCLYVGMASDNVH 
AAELVET I AATARE I EDNS RLLENMTEWRKGIQEAQVELQKAS 
E ERL LE E G VLRQ I P WGS VLNW FS P VQ ALQ KGRT FNLTAGS LES 
TEPIYVYKAQGAGVTLPPTPSGSRTKQRLPGQKPPKRSLRGSDA
LSETSSVSHIEDLEKVERLSSGPEQITLEASSTEGHPGAPSPQH 
TDQTEAFQKGVPHPEDDHSQVEGPESLR 


5813 


2936 


699 


HRDGVSGSLERPLTDRSRTGAFAQQRGKMATAGGGSGADPGSRG 
LLRLLSFCVLIAGLCRGNSVERKIYIPLNKTAPCVRLLNATHQI 
G CQ S S I S GDTG V I HWE KE EDLQW VLTDG PNP P YMVLLES KHFT 
RDLMEKLKGRTSRIAGLAVSLTKPSPASGFSPSVQCPNDGFGVY 
SNS YGPE FAHCRE I QWNS LGNGLAYEDFS FP I FLLEDENETKVI 
KQCYQDHNLSQNGSAPTFPLCAMQLFSHMAWLSFSTAT\CMRRS 
SIQSTFSINPKIVCDPLSDYNVWSMLKPINTTGTLKPDDRVWA 
ATRLDSRSFFWNV\APGAESAVASFVTQLAAAEALQKAPDVTTL 
PRNVMFVFFQGETFDYIGSSRMVYDMEKGKFPVQLENVDSFVEL 
GQ VALRTS L E LWMHTD P VS QKNE S VRNQ VE DLLATLE KS G AG V P 
AVILRRPNQSQPLPPSSLQRFLRARNISGWLADHSGAFHNKYY 
QSIYDTAENINVSYPEWLEPLKE/ETMNFG+QDTAKALADVATV 
LGRALYELAGGTNFSDTVQADPQTVTRLLYG\FLIKANNSWFQS 
I LQGRDLRS YLG * RGL FQH \ YI AV \ S S PTNTI YV/ VLQYALANL 
TGTWNLTREQCQDPSKVPSENKDLYEYSWVQGPLHSNETDRLP 
RCVRSTARLARALSPAFELSQWSSTEYSTWTESRWKDIRARIFL 
IASKELELITLTVGFGILIFSLIVTYCINAKADVLFIAPREPGA 
VSY 


5814 


8500 


432 


ALKCRPRRVLAILVGFVQPDRMAEEGAVAVCVRVRPLNSREESL 

GETAQV YW KTHNNVI Y P VDG S KSFN FDRVLHGNET PKNVYEA\ I 

AAPIIDSAIQGYNGTIFA\YGQT\ASGKTYTMMGSEDHLGVIPQ 

GQFHGHFSQKI*EVFLDREFLLRVSYMEIYNETITDLLCGTQKM 

KPLIIREDVNRNVYVADLTEEWYTSEMALKWITKGEKSRHYGE 

TKMNQRS SRS HT IFRMILESRE KGE PS NCEG S VKVS HLNL VDLA 

GS E RAAQTG AAG VRLKEG CN I NRS L F I LGQ VI KKL S DGQ VGG F I 

NYRDS KLTR I LQNSLGGNPKTRI I CT I TP VS FDE TLTALQFAST 

AKYMKNTPYVNEVSTDEALLKRYRKEIMDLKKQLEEVSLETRAQ 

AM E KDQLAQLL E E KDLLQ K VQNE K I ENLTRM L VT S S S LTLQQ EL 

KAKRKRRVTWCLGKINKMKNSNYADQFNIPTNITTKTHKLSINL 

LREIDESVCSESDVFSNTLDTLSEIEWNPATKLLNQENIESELN 

SLRADYDNLVLDYEQLRTEKEEMELKLKEKNDLDEFEALERKTK 

KDQEMQLIHEISNLKNLVKHREVYNQDLENELSSKVELLREKED 

Q I KKLQ E Y I DS Q KLEN I KMDLS YS LES I EDPKQM KQTLFDAE TV 

ALDAKRESAFLRSENLELKEKMKELATTYKQMENDIQLYQSQLE 

AKKKMQVDLEKELQSAFNEITKLTSLIDGKVPKDLLCNLELEGK 

I TDLQKE LNKEVE ENEALRfi E VI LLS ELKS LPS EVERLRKE IQD 

KSEELHIITSEKDKLFS EWHKES R VQGLLE E I G KTKDDLATTQ 

SNYKSTDQEFQNFKTLHMDFEQKYKMVLEENERMNQEIVNLSKE 

AQKFDSSLGALKTELSYKTQELQEKTREVQERLNEMEQLKEQLE 

NRDS P LQTVE RE KTL I TEKLQQTLEE VKTLTQEKDDLKQLQ E S L 

Q I ERDQLKS D I H DT VNMNI DTQEQLRNALE S LKQHQE T INTL KS 

KISEEVSRNLHMEENTGETKDEFQQKMVGIDKKQDLEAKNTQTL 

TADVKDNEI IEQQRKI FSL I QEKNELQQML ES V I AEKEQLKTDL 

KENIEMTIENQEELRLLGDELKKQQEIVAQEKNHAIKKEGELSR 

TCDRLAE VEEKLKE KS QQLQEKQQQLLNVQEEMS EMQKKINB I E 

NLKNELKNKELTLEHM E TERLE LAQKLNEN YEE VKS I TKERKVL 

KELQKSFETERDHLRGYIREIEATGLQTKEELKIAHIHLKEEQE 

TIDELRRSVSEKTAQIINTQDLEKSHTKLQEEIPVLHEEQELLP 

NVKKVSETQETMNELELLTEQSTTKDSTTLARIEMERLRLNEKF 

QESQEEIKSLTKERDNLKTIKEALEVKHDQLKEHIRETLAKIQE 

SQS KQEQS LNMKEKDNETTKI VS EMEQFKPKDSALLR I E I EMLG 

LS KRLQESHDEMKSVAKEKDDLQRLQEVLQS ESDQLKENI KE I V 

AKHLETEEELKVAHCCLKEQEETINELRVNLSEKETEISTIQKQ 

LEAINDKLQNKIQEIYEKEEQLNIKQISEVQEKVNELKQFKEHR 

KAKDSALQSIESKMLELTNRLQESQEEIQIMIKEKEEMKRVQEA 

LQ I ERDQLKENTKE I VAKMKES QEKE YQFLKMTAVNETQEKMCE 

I EH L KEQFE TQKLNLEN I E TEN I RLTQ I LHENLE EMRS VTKERD 

DLRSVEETLKVERDQLKENLRETITRDLEKQEELKIVHMHLKEH 
SEQ ID NO:


Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence


Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine,
R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








QETIDKLRGIVSEKTNEISNMQKDLEHSNDALKAQDLKIQEELR 
IAHMHLKEQQETIDKLRGIVSEKTDKLSNMQKDLENSNAKLQEK 
IQELKANEHQLITLKKDVNETQKKVSEMEQLKKQIKDQSLTLSK 
LEIENLNLAQKLHENLEEMKSVMKERDNLRRVEETIiKLERDQLK 
ESLQETKARDLEIQQELKTARMLSKEHKETVDKLREKISEKTIQ 
I S D I QKDLD KS KDE LQ KKI QELQ KKELQLLRVKED VNMSHKK I N 
EMEQLKKQFEPNYLCKCEMDNFQLTKKLHESLEEIRIVAKERDE 
LRRIKESLKMERDQFIATLREMIARDRQNHQVKPEKRLLSDGQQ 
HLMESLREKCSRIKELLKRYSEMDDHYECLNRLSLDLEKEIEFH 
RIMKKLKYVLSYVTKIKEEQHECINKFEMDFIDEVEKQKELLIK 
IQHLQQDCDVPSRELRDLKLNQNMDLHIEEILKDFSESEFPSIK 
TEFQQVLSNRKEMTQFLEEWLNTRFDIEKLKNGIQKENDRICQV 
NNFFNNRIIAIMNESTEFEERSATISKEWEQDLKSLKEKNEKLF 
KNYQTLKTSLASGAQVNPTTQDNKNPHVTSRATQLTTEKIRELE 
NSLHEAKESAMHKESKI I KMQKELEVTNDI IAKLQAKVHESNKC 
LEKTKETIQVLQDKVALGAKPYKEEIEDLKMKU3KIDLEKMKNA 
KEFEKEISATKATVEYQKEVIRLLRENLRRSQQAQDTSVISEHT 
DPQPSNKPLTCGGGSG1VQNTKALILKSEHIRLEKEISKLKQQN 
EQLI KQKNELLSNNQHL SNE VKTWKERTLKREAHKQVTCENS PK 
SPKVTGTASKKKQITPSQCKERNLQDPVPKESPKSCFFDSRSKS 
LPSPHPVRYFDNSSLGLCPEVQNAGAESVDSQP\GPWARLFQGK 
DVP\ECKTQ s 


5815 


23 


1460 


SELVMWTVQNRESLGLLSFPVMITMVCCAHSTNEPSNMSYVKET 
VDRLLKGYD I RLRPDFGGP PVDVGMRIDVAS I DMVSEVNMDYTL 
TMYFQQSWKDKRLSYSGIP LNLTLDNR VADQLWVP DT Y FLNDKK 
SFVHGVTVKNRMIRLHPDGTVLYGLRITTTAACMMDLRRYPLDE 
QNCTLEIESYGYTTDDIEFYWNGGEGAVTGVNKIELPQFSIVDY 
KMVSKKVEFTTGAYPRLSLSFRLKRNIGYFILQTYMPSTLITIL 
SWVS FWINYDASAARVALGI TTVLTMTTISTHLRETLPKI P YVK 
AIDIYLMGCFVFVFLALLEYAFVNYIFFGKGPQKKGASKQDQSA 
NEKNKLEMNKVQVDAHGNILLSTLEIRNETSGSEVLTSVSDPKA 
TMYSYDSASIQYRKPLSSRE\A*GRAPDRHGVPSKGRIRRRAS\ 
QLKVKI PDLTDVNS IDKWSRMFFPITFSLFNWYWLYYVH 


5816 


861 


191 


TSSRSRAAAQEGDAETPGSVERRGRRAGAEDGMSQAPGAQPSPP 
TVYHERQRLELCAVHALNNVLQQQLFSQEAADEICKRLAPDSRL 
NPHRSLLGTGNYDVNVIMAALQGIX3LAAVWWDRRRPLSQLALPQ 
VLGL I LNLPS PVS LGLLS LPLRRRHLRWPCARL / VTVS Y YNLDS 
K\LRAPEGPGGLRTE\*GPFLAAALAQGLCEVLLWTKEVEEKG 
SWLRTD 


5817 


851 


118 


RLFRGPGANRGRSCRGCSGGREPSGGALPKRHCPC*PPSPPAAD
VMSNTTVPNAPQANS DSM VGY VLGP FFL I TLVG WVAWM YVQK 
KKRVDRLRHHLLPMYSYDPAEELHEAEQELLSDMGDPKW\QAG 
RVATSTSGCHCWMSRRDLTPLPHPSEPGVLDCLGPCHLLPLLSP 
GSPCWVLGLHFSLHPPSAASASHALTITSLPPGLLPFVGVELTA 
H PQ ALMGRG F P S GMAAAGRHLC FL 


5818 


3 


3918 


QAliRDKLWIFLVQSFYAVRHTESWKLMSTDbQQklQAAAFDKGD 
DRRLGKKPIFSSSQQRKQVSDSGDIKIKSWRGNNKKECWSYLST 
NKKMKS DG LG AS GHS S S TNRNS I N KTLKQDD VKE KDGTKI AS K I 
TKELKTGGKNVSGKPXTVTKSKTENGDKARLENMSPRQWERSA 
TAAAAATGQKNLLNGKG VRNQEGQ I SGARPKVLTGNLNVQAKAK 
PLKJCATGKDSPCLSIAGPSSRSTDSSMEFSISTECLDEPKENGS 
TEEEKPSGHKLS FCDS PGQMMKNSVDSVKNST VAI KSRPVSRVT 
NGTSNKKS I HEQDTNVNNSVLKKVSGKGCSEP VPQAI LKKRGTS 
NGCTAAQQRTKSTPSNLTKTQGSQGESPNSVKSSVSSRQSDENV 
AKLDHNTTTE KQAP KR KM VKQ VHTALP KVNAKI VAM PKNLNQS K 
KGETLNNKDSKQKMPPGQVISKTQPSSQRPIiKHETSTVQKSKFH 
DVRDNNNKDS VS EQKPHKPL INLAS EIS DAEALQS S CRP \DPQK 
PLNIX3EKEKLALECQNISKLDKSLKHELESKQICLDKSETKFPN 
HKETDDCDAANICCHSVGSDNVNSKFYSTTAIiKYMVSNPNENSIj 
NSNPVCDLDSTSAGQIHLISDRENQVGRKDTNKQSSIKCVEDVS 
LCN P ERTNGTLNSAQED KKS KVP VEGLT I PS KLSDES AMDEDKH
ATADSDVSSKCFSGQLSEKNSPKNMETSESPESHETPETPFVGH 
WNL STGVLHQRESPES DTGS ATT S S DD I KPRS ED YDAGGS Q DDD 
GSNDRGISKCGTMLCHDFLGRSSSDTSTPEELKIYDSNLRIEVK 
MKKQSSNDLFQVNSTSDDEIPRKRPEIWSRSAIVHSRERENIPR 
GSVQFAQEIDQVSSSADETEDERSEAENVAENFSISNPAPQQFQ 
GIINLAFEDATENECREFSANKKFKRSVLLSVDECEELGSDEGE 
VHTPFQASVDSFSPSDVFDGISHEHHGRTCYSRFSRESEDNILE 
CKQNKGNSVCKNESTVLDLSSIDSSRKNKQSVSATEKKNTIDVL 
SSRSRQLLRED KKVNNGSNVEND I Q QRS K FLDS DVKS QER P CHL 
DLHQREPNSDIPKNSSTKSLDSFRSQVLPQEGPVKESHSTTTEK 
AN I ALS AGD I DDCDTLAQTRM YDHR P S KTLS P I YEMD V I EAFEQ 
KVESETHVTDMDF*DDQHFAKQDWTLLKQLLSEQDSNLDVTNSV 
PEDLSLAQYLINQTLLLARDSSKPQGITHIDTLNRWSELTSPLD 
S SAS I TMAS FS S EDCS PQGEWT I LE LETQH 


5819 


1 


5557 


AAAGLLGALHLVMTL WAAARAE KEAFVQSES 1 1 E VLRFDDGGL 

LQTETTLG LS S YQQKS I S L YRGNCR P I R FE P PMLDFHEQ P VGM P 

KMEKVYLHNPSSE+TITLVSIFATTSHFHASFFQNRKILPGGNT 

S FDVS /VFLARWGNVENTLFINTSNHGVFTY\QVFGVGVPNP Y 

RLRPFLGARVTVNSSFSPIINIHNPHSEPLQWEMYSSGGDLHL 

ELPTGQQGGTRKLWEIPPYETKGVMRASFSSREADNHTAFIRIK 

TNASDSTE F 1 1 L P VEVEVTTAPG I YS S TEMLD FGTLRTQDLPKV 

IjNTiHLLNSGTKDVPITSVRPTPQ\NDAITVHFKPITLKAS\ESK 

YTKVASISFDASKAKKPSQFSGKITVKAKEKSYSKLEIPYQAEV 

LDGYLGFTDHAATLFHIRDSPADPVERPIYLTNTFSFAILIHDVL 

LPEEAKTMFKVHNFSKPVLILPNESGYIFTLLFMPSTSSMHIDN 

N I LL I TNAS KFH L P VR VYTG FLD Y F VL PPK1EERFID FG VLS AT 

EASNILFAIINSNPIELAIKSWHIIGDG\LSIELVAVDRGNRTT 

I I S S LPECEKSS S SDQS S VTLASG YF \ AVFRVKLTAKKL \ EGI H 

DGAI QI TTDYE I LTI PVK\ AVI AVGSLTCSPKHWLPPS FPGKI 

VHQSLNIMNSFSQKVKIQQIRSLSEDVRFYYKRLRGNKEDLEPG 

KKSKIANIYFDPGLQCGDHCYVGLPFLSKSEPKVQPGVAMQEDM 

WDADWDLHQSLFKGWTGIKENSGHRLSAIFEVNTDLQKNIISKI 

TAELS W PS ILSS PRHLKFPLTNTNCS S \EEE ITLENP/SQDVPV 

YVQFIPLALYSNPSVFVDKLVSRFNLSKVAKIDLRTLEFQVFRN 

SAHPLQSSTGFMEG\LSPHLILNLILKPGEKKSVKVK\FTPVHN 

RTVSS L I I VRNNLTVMDAVMVQGQGTTENLR VAGKLPGPGS SLR 

FKITEALLKDCTDSLKLREPNFTLKRTFKVENTGQLQIHIETIE 

ISGYSCEGYGFKWNCQEFTLSANASRDIIILFTPDFTASRVIR 

ELKFITTSGSEFVFILNASLPYHMLATCAEALPRPNWELALYII 

ISGIMSALFLLVIGTA\YLEAQGIWEP\FRRRLS\FEASNPPFD 

VGR P FDLRRI VG I S S EGNLNT LS CD PGHSRGFCGAGGS S S R PSA 

GSHKQ*GPSGHPHSSHSNRNSADVDDVRAYNSGRTSSMTSAQAA 

SSQPANKTRPLVLDSNTGAQGHSAGRKSKGAKQSQHGSQHHAHS 

PLEQHPQPPLPPPVPQPQEPQPERLSPAPLAHPSHPERASSARH 

SSEDSDITSLIEAMDKDFDHHDSPALEVFTEQPPSPLPKSKGKG 

KPLQRKVKPPKKQEEKEKKGKGKPQEDELKDSLADDDSSSTTTE 

TS NPDTE PLLKEDTEKQKGKQAMPEKHESE MS QVKQKSKKLLN I 

KKEIPTDVKPSSLELPYTPPLESKQRRNLPSKIPLPTAMTSGSK 

SRNAQKTKGTSKLVDNRPPALAKFLPNSQELGNTSSSEGEKDSP 

PPEWDSVPVHKPGSSTDSLYKLSLQTLNADXFLKQRQTSPTPAS 

PSPPAAPCPFVARGSYSSIVNSSSSSDPKIKQPNGSKHKLTKAA 

S L PGKNGN P TFAA VTAG YDKS PGGNGFAKVS S NKTGFS S S LG I S 

HAPVDSDGSDSSGLWSPVSNPSSPDFTPLNSFSAFGNSFNLTGE 

VFSKLGLSRSCNQASQRSWNEFNSGPSYLWESPATDPSPSWPAS 

SGSPTHTATSVLGNTSGLWSTTPFSSSIWSSNLSSALPFTTPAN 

TLAS IGLMGTENS PAPHAPSTS S PADDLGQTYNP WR I WS PTIGR 

RSSDPWSNSHFPHBN 


5820 


310 


1270 


RVSLSGPVSLGVLLCARSSTMGKRDNRVAYMNPIAMARSRGPIQ 
SSGPTIQ\ VI * IDQGLPGKK* KSN* KRKRK/DSKALAEFEEKMN 
ENWKKELEKHREKLLSGSESSSKKRQRKKKEKKKSW*\DSSSS\ 
SSSSDSSSSSSDSEDEDKKQGKRRKKKKNRSHKSSESSMSETES 
DSKDSLKKKKKSKDGTEKEKDIKGLSKKRKMYSEDKPLSSESLS 
ESEYIEEVRAKKKKSSEEREKATEKTKKKKKHKKHSKKKKKKAA 
SSSPDSP+H*EKSGFPYKESAMSEEISTVKTTTYLLKCMNFLVF 
GIIPGLFSSHSDATV 


5821 


179 


915 


KWRNQSWRWPKPGTNWMLSCSVCWRRVTWTGSVWMRKLGKHPQT 
PT/IKDCSIAATGKRPSARFPHQRRKKRRE^4DDGLAEGGPQRSN 
TYVIKLFDRSVDLAQFSENTPLYPICRAWMRNSPSVRERECSPS 
SPLPPLPEDEEG\SEVTNSKSR*CVQACPPTHTPGGQPKNACR\ 
SR I P S PLAALRMQGT P * RWS PFE PE PS PSTL I YRNMQR WKR I RQ 
RWKEASHRNQLRYSESMKILREMYERQ 


5822 


464 


4379 


QTLKEMPIVMARDLEETASSSEDEEVISQEDHPCIMWTGGCRRI 
PVLVFHADAILTKDNNIRVIGERYHLSYKIVRTDSRLVRSILTA 
HGFHEVHPSSTDYNLMWTGSHLKPFLLRTLSEAQKVNHFPRSYE 
LTR KDRL YKN 1 1 RMQHTHGFKAFHI LPQTFLLPAE YAE FCNS YS 
KDRGPWIVKPVASSRGRG\VYLINNPNQISLEENILVSRYINNP 
LLIDDFKFDVRLYVLVTSYDPLVIYLYEEGLARFATVRYDQGAK 
NIRNQFMHLTNYSVNKKSGDYVSCDDPEVEDYGNKWSMSAMLRY 
LKQEGRDTTALMAHVEDLI I KTI I SAELAIATACKTFVPHRSSC 
FELYGFDVLIDSTLKPWLLEVNLSPSLACDAPLDLKIKASMISD 
MFTWGFVCQDPAQRASTRPIYPTFESSRRNPFQKPQRCRPLSA 
SDAEMKNLVGSAREKGPGKLGGSVLGLSMEEIKVLRRVKEENDR 
RGGFIRIFPTSETWEIYGSYLEHKTSMNYMLATRLFQDRMTADG 
APELKI*SLNSKAKLHAALYERKLLSLEVRKRRRRSSRLRAMRP 
KYPVITQPAEMNVKTETESEEEEEVALDNEDEEQEASQEESAGF 
LRENQAK YT PS LTAL VENTP KENSMK VREWNNKGGH CCKLE TQE 
L E P K FNLMQ I L QDNGNLS KMQ AR I AFS A YLQHVQ I \ R LM KDS GG 
QTFS AS W AAKE D EQM ELWR FLKRAS NNLQHS L RMV h PS RRLAL 
L ERTR I LAHQLG D F 1 1 VYN KE TEQMAE KKS KKKVEE E EEDG VNM 
ENFQEFIRQASEAELEEVLTFYTQKNKSASVFLGTHSKISKNNN 
NYSDSGAKGDHPETIMEEVKIKPPKQQQTTEIHSDKLSRFTTSA 
EKEAKLVYSNSSSGPTATLQKIPNTHLSSVTTSDLSPGPCHHSS 
LSQIPSAIPSMPHQPTILLNTVSASASPCLHPGAQNIPSPTGLP 
RCRSGSHTIGPFSSFQSAAHIYSQKLSRPSSAKAGSCYLNKHHS 
G I AKTQ KEGEDAS L YS KR YNQSMVTAELQRLAEKQAARQ YS PS S 
HINLLTQQVTNLNLATGI INRSSASAPPTLRPI ISPSGPTWSTQ 
SDPQAPENHSSSPGSRSLQTGGFAWEGEVENNVYSQATGVVPQH 
KYHPTAGSYQLQFALQQLEQQKLQSRQLLDQSRARHQAIFGSQT 
LPNSNLWTMNNGAGCRISSATASGQKPTTLPQKWPPPSSCASL 
VPKPPPNHEQVLRRATSQKASKGSSAEGQLNGLQSSLNPAAFVP 
ITSSTDPAHTKIMNHKHTEKQPVHHSWVHD 


5823 


42 


2293 


LLTALSMEGGGGRDEPSACRAGDVNMDDPKKEDILLLADEKFDF 
DLSLSSSSANEDDEVFFGPFGHKERCIAASLELNNPVPEQPPLP 
TSESPFAWSPLAGEKFVEVYKEAHLLALHIESSSRNQAAQAAKP 
EDPRSQGVERFIQESKF\KINLFEKEKEMKKSPTSLKRETYYLS 
DSPLLGPPVGEPRLLASSPALPSSGAQARLTRAPGPPHSAHALP 
RESCTAHAASQAATQRKPGTKLLLPRAASVRGRGIPGAAEKPKK 
E I PAS PS R TK I PAE KES HRD VL PD KPAP GA VNVPAAGS HLGQG K 
RAIPVP\NKLGLKKTLLKAPGSYSN\LQRKSSSGA\VWSGASSA 
CTPQPVAKAKSSEFAS I PAN * LPGLCPNI SKS \GRMGPAMLRPA 
Ii\ PAGPVG \ AS S WQAKRVDVS ELAAEQLTAP? \ SAS PTQ PQTPE 
GGG \ QWLNS S CAWS E S S QLNKTRS I RRRDS CLNS KTKVM PTP TN 
QFKIPKFSIGDS\PDSSTPKLSRAQRPQSCTSVGRVTVHSTPVR 
RSSGPAPQSLLSAWRVSALPTPASRRCSGLPPMTPKTMPRAVGS 
PL \ C VPARRRS SE PR KNSAMRTE PTRESNRKTDSR \ LVDVS PDR 
GSPPSRVPQALNFSPEESDSTFSKSTATEVAREEAKPGGDAAPS 
EALLVD I KLE PLAVTPDAASQPL I DLPL I DFCDTPEAHVAVGSE 
SRPLIDLMTNTPDMNKNVAKPSPWGQLIDLSSPLIQLSPEADK 
ENVDSPLLKF 


5824 


42 


2293 


LLTALSMEGGGGRDEPSACRAGDVNMDDPKKEDILLLADEKFDF 
DLSLSSSSANEDDEVFFGPFGHKERCIAASLELNNPVPEQPPLP 
TS E S P FAW S P LAGEK FVE VYKEAH LIjALH I E S S S RNQAAQAAKP 






WO 01/53312 



PCT/US00/34263 



EDPRSQGVERFIQESKF\KINLFEKEKEMKKSPTSLKRETYYLS 
DS PLLG P P VG EPRLLAS S PALPS SGAQARLTRAPG P PHSAHALP 
RESCTAHAASQAATQRKPGTKLLLPRAASVRGRGIPGAAEKPKK 
EI PAS PSRTKI PAEKESHRDVLPDKPAPGAVNVPAAGSHLGQGK 
RAIPVP\NKLGLKKTLLKAPGSYSN\LQRKSSSGA\VWSGASSA 
CTPQPVAKAKSSEFASI PAN*LPGLCPNI SKS\GRMGPAMLRPA 
L\PAGPVG\ASSWQAKRVDVSELAAEQLTAPP\SASPTQPQTPE 
GGG\QWLNSSCAWSESSQLNKTRSIRRRDSCLNSKTKVMPTPTN 
QFKIPKFSIGDS\PDSSTPKLSRAQRPQSCTSVGRVTVHSTPVR 
RSSGPAPQSLLSAWRVSALPTPASRRCSGLPPMTPKTMPRAVGS 
PL\CVPARRRSSEPRKNSAMRTEPTRESNRKTDSR\LVDVSPDR 
GSPPSRVPQALNFSPEESDSTFSKSTATEVAREEAKPGGDAAPS 
EALLVDI KLE PLAVTPDAASQ PL I DLPL I D FCDTPEAHVAVGS E 
SRPLIDLMTNTPDMNKNVAKPSPWGQLIDLSSPLIQLSPEADK 
ENVDSPLLKF 


5625 


2 


4210 


FLQIESASPAPFSSGFLAAHPHSPGGSLATKGRSRLSAPGMLHL 

SAAPPAPPPEVTATARPCLCSVGRRGDGGKMAAAGALERSFVEL 

S G AER ER PRH FRE FT VCS IGTANAVAGAVK YS E S AGG F Y YVESG 

KLFSVTRNRFIHWKTSGDTLELMEESLDINLLNNAIRLKFQNCS 

VLPGGVYVS ETQNRVI I LMLTNQTVHRLLLPHPSRMYRS ELWD 

SQMQSIFTDIGKVDFTDPCNYQLIPAVPGISPKTSTASTAWLSSD 

GEALFALPCASGGIFVLKLPPYDIPGMVSWELKQSSVMQRLLT 

G WM PTAI RG DQS PSDR PLS LAVHCVEHDAF I FALCQDHKLRMWS 

YKEQMCLMVADMLEYVPVKKDLRLTAGTGHKLRLAYSPTMGLYL 

GIF\MHAPKRGQFCIFQLVSTESNRYSLDHISSLFTSQETLIDF 

ALTS TD I WAL WHDAENQT WKYINFEHNVAGQ WNPVFMQ PLPEE 

E I VI RDDQDPREMYLQS L FTPGQ FTNE ALCKALQ I FCRGTERNL 

DLSWSELKKEVTLAVENELQGSVTEYEFSQEEFRNLQQEFWCKF 

YACCLQYQEALSHPLALHLNPHTNMVCLLKKGYLSFLIPSSLVD 

HLYLLP YENLLTEDETT I S DDVDIARDV I CL I KCLRLI EES VT V 

DMSVIMEMSCYNLQSPEKAAEQILEDMITIDVENVMEDICSKLQ 

EIRNPIHAIGLLIREMDYETEVEMEKGFNPAQPLNIRMNLTQLY 

GSNTAGYIVCRGVHKIASTRFLICRDLLILQQLLMRLGDAVIWG 

TGQLFQAQQDLLHRTAPLLLSYYLIKWGSECLATDVPLDTLESN 

LQHLS VLELTDSGALMANRFVSSPQT I VELFFQEVARKHI I S HL 

FSQPKAPLSQTGLNWPEMITAITSYLLQLLWPSNPGCLFLECLM 

GNCQYVQLQDYIQLLHPWCQVNVGSCRFMLGRCYLVTGEGQKAL 

EC FCQAAS EVGKE EFLDRL I RSEDGE I VSTPRLQ Y YDKVLRLLD 

VIGLP ELVTQLATS AI TEAS DDW \ KSQATL\ RTCI FKHHL\ DLG 

\HNSQAYGSL* PQI PDSSRQLDCLRQLVWLCERSQLQDLVEFS 

YVNLHNEWG 1 1 ES RARAVDLMTHNY YELLYAFH I YRHNYRKAG 

TVMFEYGMRLGREVRTLRGLEKQGNCYLAALNCLRLIRPEYAWI 

VQP VSGAV YDR PGAS PKRNHDGECTAAPTNRQI E I LELEDLEKE 

CSLAR I R LTLAQHD P S AVAVAG S S S AE EMVTLLVQAGL FDTA I S 

LCQTFKLPLTPVFEGLAFKCIKLQFGGEAAQAEAWAWLAANQLS 

SVITTKESSATDEAWRLLSTYLERYKVQNNLYHHCVINKLLSHG 

VPLPNWLINSYKKVDAAELLRLYLNYDLLDLTPYQVIRICGC 


5826 


3 


871 


KSQLLRDHSAPPPKPCTSVGAMGC* PRQ/SPKEQQRQLKKQKNR 
AAAQRSRQKHTDKADALHQQHE S LE KDNLALRKE I QSLQAELAW 
WSRTLHVHERLCPMDCASCSAPGLLGCWDQAEGLLGPGPQGQHG 
CREQLELFQTPGSCYPAQPLSPGPQPHDSPSLLQCPLPSLSLGP 
AWAE P P VQLS PS PLL FASHTGS SLQGS S S KLSALQPS LTAQTA 
PPQPLELEHPTRGKLGSSPDNPSSALGLARLQSREHKPALSAAT 
WQGLWDPSPHPLLAFPLLSSAQVHF 


5827 


194 


2287 


GMGSENSALKSYTLREPPFTLPSGLAVYPAVLQDGKFASVFVYK 
RENEDKVNKAAKVP * * HLKTLRHPCLLRFLSCTVEADG IHLVTE 
RVQPLEVALETLSSAEVCAG I YDILLALI FLHDRGHLTHNNVCL 
SSVFVSEDGHWKLGGMETVCKVSQATPEFLRSIQS IRDPASI PP 
EEMSPEFTTLPECHGHARDAFSFGTLVESLLTILNEQVSADVLS 
S FQQTLHSTLLNP I PKWRPALCTLLSHDFFRNDFLEWNFLKSL 
TLKSEEEKTEFFKFLLDRVSCLSEELIASRLVPLLLNQLVFAEP 
VAV\ KSFLPYLLGPKKDHAQGETPCLLS PALFQSRVI PVLLQLF 
E VHEEHVRM VLLSH I EAYVGAT..S LREQLKKV\ I L\ PQ VLLG \ LR 
D\TSDSIVAITLHSLAVLVSLLGPEWVGGERTKIFKRTAP\SF 
TK\NTDLSLEGDPFSQPIKFPINGLSDVKNTSEDSENFPSSSKK 
SEEWPDWSGPE\EPENQTVNI\QIWP\REP\CDDVKSQCTTLDV 
EESSWDDCEPSS LDTKVNPGGG I TATKP VTS GEQKP I PALLS LT 
EESMPWKSSLPQKISLVQRGDDADQIEPPKVSSQERPLKVPSEL 
GLGEEFTIQVKKKPVKDPEMDWFADMIPEIKPSAAFLILPELRT 
VPKKJJDvS P VMQFSS KFAAAE ITEGEAEGWEEEGELNWEDNN 

W 


5828 


2 


257 


AREGGSLGAVAACGELSYSCDFCPARPHTSWLTRFVKMEFQAW 
MAVGGGSRMTDLTSSIPKPLLPVGNKPLIWYPLNLLERVGFEEV 
I WTTRD VQ KALCAE FKMKMKPDIVCI P DDADMGTADS LR Y I Y P 
KLKTDVLVLSCDLITDVALHEWDLFRAYDASLAMLMRKGQDSI 
EPVPGQKGKKKAVEQRDFIGVDSTGKRLLFMANEADLDEELVIK 
GS I LQKHPRI RFHTGLVDAHLYCLKKY I VDFLMENG \ S I TS I RS 
EL\IPYLV/RGKQFSSASSQQGTRKEKEGGSKGKRGLKSFRISY 
S F Y * KEAN YTGTGA P Y \ D\ AC W I 


5829 


260 


1259 


PDGRLIVSCSEDKTIKIWDTTNKQCVNNFSDSVGFANFVDFNPS 
GTCIASAGSDQTVKVWDVRVNKLLQHYQVHSGGVNCISFHPSGN 
YLI TAS SDGTLK I LDLLKGRL I YTLQGHTGP VFTVS FS KGGELF 
AS GGADTQ VLL W RTNFD E LH C KG LTKRNL KRLH FDS P P HLLD I Y 
PRTPHPHEEKVETVEDFFLHLLRLIQSLR* S ICRSLLPLLWISF 
LLILPQCX}KPWGLCQTRVKRPVDIS*TLP*CHQNVCQQPRKRK 
QKT*VTSPVKVK/VSIPLAVTDALEHIMEQLNVLTQTVSILEQR 
LTLTEDKLKDCLENQQKLFSAVQQKS 


5830 


4496 


3139 


GGKMAAPEERDLTQEQTEKLLQFQDLTGIESMDQCRHTLEQHNW 
N I EAAVQDRLNEQEGVP S VFNPP PSRPLQVNTADHR I YS YWSR 
PQPRGLLGWGYYLIMLPFRFTYYTILDIFRFALRFIRPDPRSRV 
TDPVGDIVSFMHSFEEKYGRAHPVFYQGTYSQALNDAKRELRFL 
LVYLHGDDHQDSDEFCRNTLCAPEVISLINTRMLFWACSTNKPE 
G YR VS Q ALRENT YP FLAM I MLKDRRE * P V \ VGRLEGL I \ Q PDDL 
INQLTFIMDANQTYLVSERLEREERNQTQVLRQQQDEAYLASLR 
ADQEKERKKREERERKRRKKEEVQQQKLAEERRRQNLQEEKERK 
LECLPPEPSPDDPESVKIIFKLPNDSRVERRFHFSQSLTVIHDF 
LFSLKESP\EKFQIEA\NFPRR\VLPCIPSEE\WPNPPTLQE\A 
GLS HTE VLFVQDLTDE 


5831 


71 


2897 


FCSKDKCCLYLPDSINRSKSCTAKPGAHSQDRHAVMDSERQVKD 
TDDIESPKRSIRDSGYIDCWDSERSDSLSPPRHGRDDSFDSLDS 
FGSRSRQTPSPDWLRGSSDGRGSDSESDLPHRKLPDVKKDDMS 
ARRTSHGEPKSAVPFNQYLPNKSNQTAYVPAPLRKKKAEREEYR 
KSWSTATSPAGLGKKALQDYGPRT\PVS\DDAESTSMFDMRCEE 
EAAVQPHSRARQEQLQLINNQLREEDDKWQDDLARWKSRKRSVS 
QDL I KKE E ERKKMEKLLAGEDGTS ERRKS I KTYRE I VQEKERRE 
RELHEAYKNARSQEEAEGILQQYIERFTISEAVLERLEMPKILE 
RSHSTEPNLSSFLNDPNPMKYLRQQSLPPPKFTA1VETTIARAS 
VLDTSMS AGSGS PS KTVTPKAVPMLTPKPYSQPKNSQDVLKTFK 
VDGKVSVNGETVHREEEKERECPTVAPAHSLTKSQMFEGVARVH 
GSPLELKQDNGSIEINIKKPNSVPQELAATTBKTEPNSQEDKND 
GGKS RKGNI ELAS S E PQH FTTTVTRCS PTVAFVE FPS S PQLKND 
VSEEKDQKKPENEMSGKVELVLSQKWKPKSPEPEATLTFPFLD 
Jvn feiAJN \2 Litiu VJN JUW £> U V u b P £> b c* KS P VTTPFKFWAWDPEEERRR 
QEKWQQEQERLLQERYQ\KEQDK\LKEE\WEKAQKEVEEEERRY 
YEEEP* 1 1 \EDPWPFTVSSSSADQLSTSSSMTEGSGTMNK1DL 
GNCQDEKQDRRWKKSFQGDDSDLLLKTRESDRLEEKGSLTEGAL 
AHSGNPVSKGVHEDHQLDTEAGAPHCGTNPQLAQDPSQNQQTSN 
PTHSSEDVKPKTLPLDKSINHQIESPSERRKSISGKKLCSSCGL 
PLGKGAAMI IETLNL YFHIQCFRCG\ ICKGQLGDAVSGTDVRIR 
NGLLNCNDCYMRSRSAGQPTTL 


5832 


2454 


829 


PGRRFRHGSCAFQKQCIMLHICQYFLQGECKFGTSCKRSHDFSN 
SENLEKLEKLGMSSDLVSRLPTIYRNAHDIKNKSSAPSRVPPLF 
VPQGTSERKDSSGSVSPNTLSQEEGDQICLYHIRKSCS FQDKCH

RVHFHLPYRWQFLDRGKWEDLDNMELIEEAYCNPKIERILCSES 

ASTFHSHCLNFNAMTYGATQARRLSTASSVTKPPHFILTTDWIW 

YWSDEFGSWQEYGRQGTVHPVTTVSSSDVEKAYLAY/WYTGV*R 

PGSHLEVPGRKAQLRVRFQSLRSEKPGLWHN*KGLPQTQIR\AP 

QDVTTMQTCNTKFPGPKS I PDYWDSSALPDPGFQKITLSSSSEE 

YQKVWNLFNRTLPFYFVQKIERVQNLALWEVYQWQKGQMQKQNG 

GKAVDERQLFHGTSAIFVDAICQQNFDWRVCGVHGTSYGKGSYF 

ARDAAYSHHYSKSDTQTHTMFLARVLVGEFVRGNASFVRPPAKE 

GWSNAFYDSCVNSVSDPSIFVIFEKHQVYPEYVIQYTTSSKPSV 

TPSILLALGSLFSSRQ 


5833 


170 


3289 


S I LCLLS P CWQ FGKP WS I LS SRS RHS PCTKKG WEGMRKHLHT
RQGHK* VHVE I S KALW VYRDD YF I RHS I SVSAVI VRAW I THKYR 
GRDWNVKWEENLLHAVAKNYTLLQTIPPFERPFKDHQVCLEWNM 
GYIWNLRANRIPQCPLENDWALLGFPYASSGENTGIVKKFPRF 
RNRELEATRRQRMDYPVFTVSLWLYLLHYCKANLCGILYFVDSN 
EMYGTPSVFLTEEGYLHIQMHLVKGEDLAVKTKFIIPLKEWFRL 
DI S FNGGQ I VVTTS IGQDLKS YHNQT I S FREDFHYNDTAG YFI I 
GG SR Y VAGI EG FFG PL KYYRLRSLHPAQ I FNPLLEKQLAEQ I KL 
YYERCAEVQEIVSVYASAAKHGGERQEACHLHNSYLDLQRRYGR 
PSMCRAFPWEKELKDKHPSLFQALLEMDLLTVPRNQNESVSEIG 
G K I FE KA V KRL SSI DGLHQ ISSIVPFLTDSS CCG YHKAS Y YLAV 
FYETGLNVPRDQLQGMLYSLVGGQGSERLSSMNLGYKHYQGIDN 
YPLDWELSYAYYSNIATKTPLDQHTLQGDQAYVETIRLKDDEIL 
KVQTKEDGDVFMWLKHEATRGNAAAQQRLAQMLFWGQQGVAKNP 
EAAI EWYAKGALETEDPAL I YD YAI VLFKGQGVKKNRRLALELM 
KKAAS KGLHQ AVNGLGW Y YH KFKKNYA\ KAAKYWL KA\ E E \ MGN 
PDAS YNLGVLHLDGI FPGVPGRNQTLAGE YFHKAAQGGHMEGTL 
WCS L YYI TGNLETFPRDPEKAWWAKH VAEKNG YLGHV IRKGLN 
AYLEGSWHEALLYYVLAAETGIEVSQTNLAHICEERPDLARRYL 
GVNCVWRYYNFSVFQ I DAPS FAYLKMGDLYYYGHQNQSQDLELS 
VQMYAQAALDGDSQGFFNLALLIEEGTI I PHH I LDFLE I DSTLH 
SNNISILQELYERCWSHSNEESFSPCSLAWLYLHLRLLWGAILH 
SALI YFLGTFLLS ILIAWTVQYFQS VSASDPPPRPSQAS PDTAT 
STAS PAVTPAADAS DQDQPT VTNNPE PRG 


5834 


17 


4020 


RFRRGGGRVFPGAFPASPSDSLGQGNSQGPPRTPKPPRT/QECG 
SAAPGPIPGQSSS*VPLRLEQIQQKADCPLSLELALKPRMAAQV 
TLEDALSNVDLLEELPLPDQQPCIEPPPSSLLYQPNFNTNFEDR 
NAFVTG I ARY I EQATVHS SMNEMLEEGQEYAVMLYTWRSCSRAI 
PQ VKCN EQPNR VE I YE KT VE VLE P E VT KLMN FM Y FQRNA I ERF C 
GE VRR L CHAERRKD FVS EAYL ITLG KF INMFAVLDE LKNM KCS V 
KNDHS A Y KRAAQ FLRXMAD PQ S I Q E S QNLS M FLANHNKI TQS LQ 
Q^LEVISGYEELLADIVNLCVDYYENRMYLTPSEKHMLLKVMGF 
GL YLMDGS VSN I YKLDAKKR INLS K I DKYFKQLQ WPLFGDMQ I 
ELARYIKTSAHYEENKSRWTCTSSGSSPQYNICEQMIQIREDHM 
RFI SELARYSNS EVVTGSGRQEAQKTDAEYRKLFDLALQGLQLL 
SQWSAHVMEVYSWKIiVHPTDKYSNKDCPDSAEEYERATRYNYTS 
EE K FAL VE V I AM I KGLQ VLMGRMES VFNHAIRHT VYAALQD FS Q 
VTLMEPLRQAI KKKKNVIQS VLQAI RKTVCDWETGHE P FND PAL 
RGEKDPKSG*DIKVPRRAVGPSSTQLYMVRTMLESLIADKSGSK 
KTLRSSLEGPTILDIEKFHRESFFYTHLINFSETLQQCCDLSQL 
WFRE F FLE LTMGRR IQFPIEMSMPWI LTDH I LET KEAS MME YVL 
YSLDLYNDSAHYALTRFTSKQFLYDE IEAEVT^CFDQFVYTCLADQ 
IFAYYKVMAGSLLLDKRLRSECKNQGATIHLPPSNRYETLLKQR 
HVQLLGRS I DLNRL I TQR VS AAM YKSLE LAIGR FES EDLTS I VE 
LDGLLE INRMTHKLLSR YLTLDGFDAMFREANHNVSAP YGR I TL 
HVFWELNYDFLPNYCYNGSTNRFVRTVLPFSQEFQRDKQPNAQP 
QYLHGSKALNLAYSSIYGSYRNFVGPPHFQVTCRLLGYQGIAW 
MEELLKWKSLLQGTILQYVKTLMEVMPKICRLPRHEYGSPGIL 
EFFHHQLKDIVEYAELKTVCFQNLREVGNAILFCLLIEQSLSLE 
EVCDLLHAAPFQNILPRVHVKEGERLDAKMKRLESKYAPLHLVP 
L I ERLGTPQQ I AI AREGDLLTKERL£(jGLSM FE V I L*TR IRS FLD 
DP I WRG PLPSNGVMHVDE CVE FHRLWS AMQFVYC I PVGTH E FTV 
EQCFGDGLHWAGCMIIVLLGQQRRFAVLDFCYHLLKVQKHDGKD 
EIIKNVPLKKMVERIRKFQILNDEIITILDKYLKSGDGEGTPVE 
HVRCFQPPIHQSLASS 


5835 


4209 


1904 


SGNIRMAQGSHQIDFQVLHDLRQKFPEVPEWVSRCMLQNNNNL 
DACCAVLSQESTRYLYGEGDLNFSDDSGISGLRNHMTSLNLDLQ 
SQNIYHHGREGSRMNGSRTLTHSISDGQLQGGQSNSELFQQEPQ 
TAPAQVPQGFNVFGMSSSSGASNSAPHLGFHLGSKGTSSLSQQT 
P R FNP I MVTLAPN I QTGRNT PTS LH I HGVP P P VLNS PQGN S I Y I 
RPYITTPGGTTRQTQQHSGWVSQFNPMNPQQVYQPSQPGPWTTC 
PASNPLSHTSSQQPNQQGHQTSHVYMPISSPTTSQPPTIHSSGS 
SQSSAHSQYNIQNISTGPRKNQIEIKLEPPQRNNSSKLRSSGPR 
TSSTSSSVNSQTLNRNQPTVY I AAS P PNTDELM S RS Q PKV Y I S A 
NAATGDEQVMRNQPTLFISTNSGASAASRNMSGQVSMGPAFIHH 
HPPKSRAI GNNSATS PRVWTQPNT \ EYTFK I TVS PNKPPAVS P 
GWSPTFELTNLLNHPDKYVETENIHHLTDPTLAHVDRISETRK 
LSMGSDDAAYTQDI * RISNS WLGMVAHACNSSALGGQDGRI I * A 
QEFETS WGN I WRLRL YRRF* NYAGMVAHTCSPS YS VD * ALLVHQ 
KARMERLQRELE IQKKKLDKLKSEWEMENNLTRRRLKRSNS I S 
QIPSLEEMQQLRSCNRQLQIDIDCLTKEIDLFQARGPHFNPSAI 
HNF YDN I G FVG PVP P KPKDQRS 1 1 KT P KTQDTEDDEG AQWNCTA 
CTFLNHPALIRCEQCEMPRHF 


5836 


361 


2303 


FHITMCGICCSVNFSAEHFSQDLKEDLLYNLKQRGPNSSKQLLK 
S D VNYQ CL FSAHVLHJLRG VLTTQP VE DERGNVFL WNGE I FS G I K 
VEAEENDTQILFNYLSSCKNESEILSLFSEVQGPWSFIYYQASS 
HYLWFGRDFFGRRSLLWHFSNLGKSFCLSSVGTQTSGLANQWQE 
VPAS\DFSELILSLLSFPDALFYNCILGNIFLGRILLKKMLIA* 
VKFQQTYQHLYQR* QMKPNCI LKNLLFL* I * CCHKLHWRLIAVT 
FPMCHLQERYFKSFLLMYT* KEVIQQFI DVLSVAVKKRVLCLPR 
DENLTANEVLKTCDRKANVAILFSGGIDSMVIATLADRHIPLDE 
PIDLLNVAFIAEEKTMPTTFNREGNKQKNKCEIPSEEFSKDVAA 
AAADSPNKHVSVPDRITGRAGLKELQAVSPSRIWNFVEINVSME 
ELQKLRRTRI CHL I RPLDTVLDDS IGCAVWFASRG I GWLVAQEG 
VKS YQSNAKWLTG IGADEQLAG YSRHRVRFQSHGLEGLNKE I M 
MELGRISSRNLGRDDRVIGDHGKEARFPFLDENVVSFLNSLPIW 
EKANLTIiPRGIGEKLLLRLAAVELGLTASALLPKRAMQFGSRIA 
KMEKINEKASDKCGRLQIMSLENLSIEKETKL 


5837 


4792 


903 


NGNAVAQAP VTNCC YLATGS KDQTIR I WS CSRGRG VM I LKLP FL 
KRRGGGIDPTVKERLWLTLHWPSNQPTQLVSSCFGGELLQWDLT 
QSWRRKYTLFSASSEGQNHSRIVFNLCPLQTEDDKQLLLSTSMD 
RDVKCWDIATLECSWTLPSLGGFAYSLAFSSVDIGS LAI GVG DG 
M IRVWNTLS I KNN YD VKNFWQGVKS KVTALC WH PT KEG CLAFGT 
DDGKVGLYDTYSNKPPQISSTYHKKTVYTIiAWGPPVPPMSLGGE 
GDRPSLALYSCGGEGIVLQHNPWKLSGEAFDINKLIRDTNSIKY 
KLPVHTEISWKADGKIMALGNEDGSIEIFQ\IPNLKLICTIQQH 
HKLVNTISWHHE\HGSPAQKLSYL\MPSGSQQCSPFTCHNLKNC 
P*KAAPESPSDPLQSPYRTPPQGHTAQDYPVWAWEPHIH*WEGL 
VFCFPIDGYSPGCWD\AFPGKEAPVAIFRG\HQGRLLCVAWSPL 
DPDCIYSG\ADDFCVHKWLTSMQDHSRPPQGKKSIBLEKKRLSQ 
PKAKPKKKKKPTLRTPVKLESIDGNEEESMKENSGPVENGVSDQ 
EGEEQAREPELPCGLAPAVSREPVICTPVSSGFEKSKVTINNKV 
ILLKKEPPKEKPETLIKKRKARSLLPLSTSLDHRSKBELHQDCL 
VLATAKHSRELNEDVSADVEERFHLGLFTDRATLYRMIDIEGKG 
HLENGHPELFHQLMLWKGDLKGVLQTAAERGELTDNLVAMAPAA 
GYHVWLWAVEAFAKQLCFQDQYVKAASHLLSIHKVYEAVELLKS 
NHFYREAIAIAKARLRPEDPVLKDLYLSWGTVLERDGHYAVAAK 
CYLGATCAYDAAKVLAKKGDAASLRTAAEIAAIVGEDELSASLA 
LRCAQELL1ANNWVGAQEALQLHES LOG QRLVFCLL ELLS RHLE 
E KQ LS EG KS S S S YHTWN TGTEG P FVER VT A VWKS I FS LDTP EQ Y 
QE AFQ KLQN I K YP S ATNNTPAKQLLLH I CHDLT LAVLS QQMAS W 
DEAVQALLRAWRS YDSGS FT I MQEVYSAFLPDGCDHLRDKLGD 
HQSPATPAFKSLEAFFLYGRLYEFWWSLSRPCPNSSVWVRAGHR 
TLSVEPSQQLDTASTEETDPETSQPEPNRPSELDLRLTEEGERM 
LSTFKELFSEKHASLQNSQRTVAEVQETLAEMIRQHQKSQLCKS 
TANGPDKNEPEV^AEQPLCSSQSQCKEEKNEPLSLPELTKRLTE 
ANQRMAKFPESIKAWPFPDVLECCLVLLLIRSHFPGCLAQEMQQ 
QAQELLQKYGNTKTYRRHCQTFCM 


5838 


110 


98 


KTM PH L LVT FRD VA IDFSQEEWECLD P AQRDL YRD VM LEN YS NL 
ISLDLESSCVTKKLSPEKEIYEMES\PSGRIWGWSTITFQYNG 
LGDNMECKGNLEGQVSKSEGLYMCVKITCEEKATESHSTSSTFH 
RII/HYQGKIVKCKECRQGFSYLSCLIQHEENHNI + KCSEVNKH 
RNTFSKKPSYI*HQ\KFRLGEKPYECMECGKAFGRTSDLIQHQK 
IHTNEKPYQCNACGKAFIRGSQLTEHQRVHTGEKPYDCKKCGKA 
FSYCSQYTLHQRIHSGEKPYECKDCGKAFILGSQLTYKQRIHSG 
EKPYECKECGKAFILGSHLTYHQRVHTGEKPYICKECGKAFLCA 
SQLNEHQRIHTGEKPYECKECGKTFFRGSQLTYHLRVHSGERPY 
KCKECGKAFISNSNLIQHQRIHTGEKPYKCKECGKAFICGKQLS 
EHQRIHTGEKPFECKECGKAFIRVAYLTQHEKIHGEKHYECKEC 
GKT F VRATQLT YHQR IHTGEKP YKCKECD KAF/HLWLT I LS EHQ 
RIHRGEKPYECKQCX3R/LFIRGSHL/NEHLRTHTGEKPYECKEC 
GRAFSRGSEHTLHQRIHTGEKPYTCVQCGKDFRCPSQLTQHTRL 
HN* EYS SHKI CMHS I ALAS LDFAHLQEKNPEN 


5839 


1 


2425 


GRPFPRPPRALPRLPLRGRRQDGRWTVDFEBCLKD\SPRFRAAL 
EEVEGDVAELELKL\DKLVKLCIA\MIDTGKAFCVANKQFMNGI 
RD\LAQNS\NNDA\WETKFAPSFLDSLQEMINFHTIL/L*PNS 
EIN * GHS FQNFVKEDLRKFKDAKKQFENSQ * KRKKI ALVKNAP V 
PSRPASLEL*KPPNILTATRKCFRHIALDYVLQINVLQSKRRSE 
I L KS M LS FM YAHLAF FHQ G YDL FS ELG P YMKD LGAQLDR L VG DA 
AKEKREMEQKHSTIQQKDFSRDDSKLKYNVDAANGIVMEGYLFK 
RAS NAF KTWNRRW FS I QNNQ WY Q KK F KDN P TVWEDLRLCT VK 
HCEDI ERRFCFE WSPTKSCMLQADSEKLRQAWI KAVQTS I \AT 
AYREKDDESEKLDKKSSPSTGSLDSGNESKEKLLKGESALQRVQ 
CIPGNASCCDCGLADPRWASINLGITLCIECSGIHRSLGVHFSK 
VRSLTLDTWEPELLKLMCELGNDVINRVYEANVEKMGIKKPQPG 
QRQEKEAYIRAKYVERKFVDKIFL*SLSPP\BQQKK\FVSKSSE 
EKRLSISKFGP\GDQVRASAQSSVRSNDSGIQQSSDDGRESI.PS 
TVSANSLYEPEGERQDSSMFLDSKHLNPGLQLYRASYEKNLPKM 
AEALAHG ADVNWANS EENKAT PL I QAVLGGS L VTCE FLLQNGAN 
VNQRDVQGRGPLHHATVLGHTGQVCLFLKRGANQHATDEEGKDP 
LS I AVE AANAD I VTLLRLARMNEE MRE S EGLYGQ PGDET YQD I F 
RDFSQMASNNPEKLNRFQQDSQKF 


5840 


698 


3610 


KHLHLPRQHLTTLWQISSPRWRSPQRAFMSALSKTQTQSAPALQ 
GLS S L LQS VTGN P VP ASEAAS QSTS AS P ANTT VYT I KGRNL PS S 
AQPFIPKSFNYSPNSSTSEVSSTSASKASIGQSPGLPSTAFKLP 
SNTKGFTATHNTSPAAPPTEVTICQSSEVSKPKL\ESESTSPSL 
\EMKIHNFLKGNPGFSVA*NLKHPNPAGSLGSSAPSESHPSDFQ 
RGPTSTSIDNIDGTPVRDERSGTPTQDEMMDKPTSSSVDTMSLL 
SKIISPGSSTPSSTRSPPPGRDESYPRELSNSVSTYRPFGLGSE 
SPYKQPSDGMERPSSLMDSSQEKFYPDTSFQEDEDYRDFEYSGP 
P P S AMMNLQKKPAKS I LKS S KLSDTTE YQ P I LS S Y SHRAQE FG V 
KSAFPPSVRALLDSSENCDRLSSSPGLFGAFSVRGNEPGSDRSP 
SPSKNDSFFTPDSNHNSLSQSTTGHLSLPQKQYPDSPHPVPHRS 
L FS PQNTLAAP TGH P PTS G VE K VLAS T I S TTST I E FKNMLKNAS 
RKPSDDKHFGQAPSKGTPSDGVSLSNLTQPSLTATDQQQQEEHY 
R I ETR VS S S CLDL PDS TE E KGAP I ETLG YHS ASNRRMSGE P I QT 
VESIRVPGKGNRGHGREASRVGWFDLSTSGSSFDNGPSSASELA 
SLGGGGSGGLTGFKTAPYKERAPQFQESVGSFRSNSFNSTFEHH 
LPPSPLEHGTPFQREPVGPSSAPPVPPKDHGGIFSRDAPTHLPS 
VDLSNPFTKEAALAHAAPPPPPGEHSGIPFPTPPPPPPPGEHSS 
SGGSGVPFSTPPPPPPPVDHSGWPFPAPPLAEHGVAGAVAVFP 
KDHSSLLQGTLAEHFGVLPGPRDHGGPTQRDLNGPGLSRVRESL 
SEQ ID NO: 

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence 

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence 

Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








TLPSHSLEHLGPPHGGGGGGGSNSSSGPPLGPSHRDTISRSGII 
LRSPRPDFRPREPFLSRDPPHSLKRPRPPPARGPPFFAPKRPFF 
PPRY 
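
The column legend (one-letter residue codes, with * for a stop codon, / for a possible nucleotide deletion and \ for a possible nucleotide insertion) can be read mechanically. The following is only a minimal sketch under stated assumptions, not part of the disclosed methods: the function names summarize_segment and open_reading_pieces are hypothetical, and whitespace inside a segment is assumed to carry no meaning.

```python
import re

# One-letter amino acid codes exactly as listed in the column legend.
AMINO_ACIDS = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
    "X": "Unknown",
}
# Non-residue markers defined by the same legend.
MARKERS = {"*": "stop", "/": "deletion", "\\": "insertion"}


def summarize_segment(segment: str) -> dict:
    """Count residues and legend markers in one table segment (hypothetical helper)."""
    text = re.sub(r"\s+", "", segment)  # assumption: spaces and line breaks are layout only
    summary = {"residues": 0, "stop": 0, "deletion": 0, "insertion": 0, "unrecognized": []}
    for ch in text:
        if ch in AMINO_ACIDS:
            summary["residues"] += 1
        elif ch in MARKERS:
            summary[MARKERS[ch]] += 1
        else:
            summary["unrecognized"].append(ch)  # anything the legend does not define
    return summary


def open_reading_pieces(segment: str) -> list:
    """Split a segment at stop codons, keeping only residue letters (hypothetical helper)."""
    text = re.sub(r"\s+", "", segment)
    pieces = []
    for piece in text.split("*"):
        residues = "".join(ch for ch in piece if ch in AMINO_ACIDS)
        if residues:
            pieces.append(residues)
    return pieces


if __name__ == "__main__":
    # One line taken from the segment listed for SEQ ID NO: 5840.
    line = r"\EMKIHNFLKGNPGFSVA*NLKHPNPAGSLGSSAPSESHPSDFQ"
    print(summarize_segment(line))
    print(open_reading_pieces(line))
```

Characters that the legend does not define are collected rather than discarded, so that any symbol outside the legend can be inspected instead of being silently counted as a residue.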


5841 


1908 


762 


GLRLFLVLTVWPMMKPSWLSRTEFSKRLLCRTLWCQSGWSSRSY 
TRSMLKMTTSINRRSRTSTKSTRTSARPGLTATVSIGLSDSPTW 
RHCWMTARSCSGEKGGHWAPRQVGVYLLPGRVGCVSSRVSPSFP 
GDGLDSGLARRGSAVSALASGLVEEPMLGPPFHPTPRFKAVSAK 
SKEDLVSQGFTEFTIEDFHNTFMDLIEQVEKQTSVADLLASFND 
QSTSDYLWYLRLLTSGYLQRESKFFEHFIEGGRTVKEFCQ\QE 
\ VE PM CKE S DH I H 1 1 ALAQGLQ R VH PGW E YMG P RPRAATTNPH I 
FP *GLPS PKVYLLYRPG\HYD I LYKIGLGSS PLGCPGCPLLARA 
LGHCYRGFSVWKWSYFTPFFLSHDPPPMFY 


5842 


307 


1918 


QEPTADFKLRSTCGCGREMTCPDKPGQLlNWFICSLCVPRVRKL 
WSSRRPRTRRNLIiLGTACAIYLGFLVSQVGRASLQHGQAAEKGP 
HRSRDTAEPSFPEIPLDGTLAPPESQGNGSTLQPNWYITLRSK 
RSKPANIRGTVKPKRRKKHAVASAAPGQEALVGPSLQPQEA\EG 
KLML*HLGTLREQTWLRLESDPGGWCGVRE/WRAGGPDFLQPSS 
RESNIRIYSESAPSWLSKDDIRRMRLLADSAVAGLRPVSSRSGA 
RLLVLEGGAPGAVLRCGPSPCGLIjKQPLDMSEVFAFHLDRILGL 
NRTLPSVSRKAEFIQDGRPCPIILWDASLSSASNDTHSSVKLTW 
GTYQQLLKQKCWQNGRVPKPESGCTEIHHHEWSKMALFDFLLQI 
YNRLDTNCCGFRPRKEDACVQNGLRPKCDDQGSAALAHIIQRKH 
DP R H L V F I DNKG F FD RS EDNLN F KLL EG I KB F PAS A VY VLKS QH 
LRQKLLQSLFLDKGYWESQGGRQGIEKLIDVIEHRAKILITYIN 
AHGVKVLPMNE 


5843 


500 


1453 


GTARLVTCWVLHGQ* VKKPAWEPGWWL*Q * RCR PKGWGLGAGM 
RGSRMSQPPQCLRRAQSSCCHFMVKLLDDGTFMIPGEKVAHTSL 
DALVTFHQQKP I E PRRELLTQPCRQKDPANVDYEDLFLYSNAVA 
EEAACPVSAPEEASPKPVLCHQSKERKPSAEM/RQNNHQGSHFL 
LPPKIPSWRDPPETLEEPQNAPRERPEGPAAAKKPPRHCELWT 
LGCPEIHGDLRPWDRKRQPRSLRGSHLGGQRLHGSLCGHISQKP 
LTAPGTKRQKGPHQEGREVGQLH+GDPRGQELAPNGSESPILPG 
VQARAPGLGRA 


5844 


202 


2471 


FDSA VLS S I NVMAVL PG PLQLLG VLLT ISLSSIRLI QAGA Y YG I 
KPLPPQIPPQMPPQIPQYQPLGQQVPHMPLAKDGLAMGKEMPHL 
QYGKEYPHLPQYMKEIQPAPRMGKEAVPKKGKEIPLASLRGEQG 
PRGEPGPRGPPGPPGLPGHGIPGIKGKPGPQGYPGVGKPGMPGM 
PGKPGAMGMPGAKGEIGQKGEIGPKGIP*PQGPPGPHGLPGIGK 
PGGPGLPGQPGPKGDRGPKGLPGPQGLRGPKGDKGFGMPGAPGV 
KGPPGMHGPPGPVGLPGVGKPGVTGFPGP\QGPLGK\PGAPGEP 
GPQGPI GVPGVQGP PGI PG IGKPGQDG\ I PGQPGFPGGKGEQGL 
PGLPGPPGLPGIGKPGFPGPKGDRGMGGVPGALGPRGEKGPIGA 
PGIGGPPGEPGLPGIPGPMGPPGAIGFPGPKGEGGIVGPQGPPG 
PKGEPGLQGFPGKPGFLGEVGPPGMRGFPGPIGPKGEHGQKGVP 
GLPGVPGLLGPKGEPGIPGDQGLQGPPGIPGIGGPSGPIGPPGI 
PGPKGEPGLPGPPGFPGIGKPGVAGLHGPPGKPGALGPQGQPGL 
PG P PG P PG P PG P PA VM PPT P P PQG E YLPDMGLG I DG VKP PHA YG 
A KKG KNGG PA YEM P AFTAELTAP FP P VGAP VKFNKLL YNGRQN Y 
N PQTG I FTCE VPGVYYFAYHVHCKGGNWJ VALFKNNE P VMYTYD 
EYKKGFLDQASGSAVLLLRPGDRVFLQMPSEQAAGLYAGQYVHS 
SFSGYLLYPM 


5845 


215 




HASNKSASLQDKMANPKEKTAMCLVNELARFNRVQPQYKLLNER 
G PAHS KM FS VQ LS LG EQTWES EG S S I KKAQQAVGNKALTES TLP 
KPI*KPPKSNVNNNPGCITPTVELNGliAMKRG\KPAIHRPLDPK 
P F PNNRANYN FQVM YNQRYHCP I PK I FYVQLTVGNNE F FGEGKT 
RQAARHNAAM KALQALQNEPI PERS PQNGESGKDMDDDKDANKS 
EISLVFEIALKRNMPVSFEVIKESGPPHMKSFVTRVSVGEFSAE 
GEGNSKKLSKKRAATTVLQELKKLPPLPWEKPK\HFFKKRPKT 
IVKAG PE YGQGMNP I SRLAQIQQAKKEKE PDYVLLSERGMPRRR 
EFVMQVKVGNEVATGTGPNKKIAKKNAAEAMLLQLGYKASTNLQ 
DQLEKTGENKGWSGPKPGFPEPTNNTPKGILHLSPDVYQEMEAS 








RHKVI SGTTLGYLS pkdmnqpsss ffs I sptsnssati arellm 

NGTSSTAEAIGLKGSS PTPPCS PVnpc; vm .v vt no t no pmmvp 
DRQSGKECVTCLTLAP VQMTFHAIGSS IEASHDQV+ YATAILLC 
YG PAR K WKA I KMEAMCAHAALLSL I HYLIAPSARLE KS KLFALG 

N- 


5846 


1126 


456 


FSKLIMKTFIIGISGVTNSGKTTLAKNLQKHLPNCSVISQDDFF 
KPESEIETDKNGFLQYDVLEALNMEKMMSAISCWMESARHSWS 
TDQES AEE I P I LI I EGFLLFNYKPLDTI WNRS YFLT I P YEECKR 
RRSTRVYQPPDSPGYFDGHVWPMYLKYRQEMQDITWEWYLDGT 

KSEEDLFLQVYEDLIQELAKQKCLQVTA*RRNTTNPS/CK* IRK 
LQGVI 


5847 


2769 


505 


APEMEDLSSPDSTLLQGGHNLLSSAS FQESVTFKDVIVDFTQEE 
WKQLDPGQRDLFRDVTLENYTHLVSIGLQVSKPDVISQLEQGTE 
PWIMEPSIPVGTCADWETRLENSVSAPEPDISEEELSPEVIVEK 
HKRDDSWSSNLLESWEYEGSLERQQANQQTLPKEIKVTEKTIPS 
WEKGPVNNEFGKSVNVSSNLVTQEPSPEETSTKRSIKQNSNPVK 
KEKSCKCNECGKAFSYCSALIRHQRTHTGEKPYKCN*/CVEKAF 
SRSENLINHQRIHTGDKPYKCDQCGKGFIEGPSLTQHQRIHTGE 
KPYKCDECGKAFSQRTHLVQHQRIHTGEKPYTCNECGKAFSQRG 
riri*janyi\xni^iiK.t'i'is.cJJiJ,L.L>Ki r TRSTHLTQHQKIHTGEKTYK 
CNECGKAFNGPSTFIRHHMIHTGEKPYECNECGKAFSQHSNLTQ 
HQKTHTGEKPYDCAECGKSFSYWSSLAQHLKIHTGEKPYKCNEC 
GKAFSYCSSLTQHRRIHTREKPFECSECGKAFSYLSNLNQHQKT 
HTQEKAYECKECGKAFIRSSSLAKHERIHTGEKPYQCHECGKTF 
* S YGS S L I QHR K I HTGE R P YKCNECGRA FNQN I HLTQHKR I HTGA 

KPYECAECX3KAFRHCSSLAQHQKTHTEEKPYQCNKCEKTFSQSS 
HLTQHQRIHTGEKPYKCNECDKAFSRSTKLTQHQRIHTGEKPYK 
CNE CG K \ TFS QS T Y L IQHQR I HSG E K P FGCNDCGKS FR YR S ALN 
KHQRLHPGI 


5848 


22 


2961 


AAPRRLLRGGDGDRTPRFPLPALLRPGPPAEAAPERRKMPAVSK 
GDGMRGLAVF I S D I RNCKS KE AE I KR INKELiAN I RS KFKGDKAL 
DG YS KKK YVCKLL FI FLLGHD I DFGHMEAVNLLSSNR YTE KQ IG 
YLFISVLVNSNSELIRLINNAIKNDLASRNPTFMGLALHCIASV 
GSREMAE AFAGE I PKVLVAGDTMDS VKQ SAALCLLRL YRTS PDL 
VPMGDWTS RVVHLLNDQHLG WTAATSLI TTLAQKNPEEFKTSV 
S LAVS RLS \RI VTSAS TDLQD YT Y * FCPG FLGLS VKLLRLLQCY 
PPPDPAVRGRLTECLETILNKAQEPPKSKKVQHSNAKNAVLFEA 
ISLIIHHDSEPNLLVRACNQLGQFLQHRETNLRYLALESMCTIiA 
SSEFSHEAVKTHIETVINALKTERDVSVRQRAVDLLYAMCDRSN 
APQIVAEMLSYLETADYSIREEIVLKVAILAEKYAVDYTW\YVD 
TI LNL IRIAGDYVSE E VWYR V I Q I VINRDD VQG YAAKTVFEALQ 
APACHENLVKVGGYILGEFGNLIAGDPRSSPLIQFHLLHSKFHL 

CSVPTRAT.7.T.QTVT IfTnTKTT T7DT7\rVr>T<T/-\T-*n7r nnno/>T 

v.o vri ivrtuijijo i xx js,tf viMLir tra v a.P I IQDVLRSDSQLRNADVEL 
QQRAVEYLRLSTVASTDILATVLEEMPPFPERESSILAKLKKKK 
GPSTVTDLEDTKRDRSVDVNGGPE PAPAS TSAVSTPSPSADLLG 
LGAAP PA PAG PP PS S GG SGLLVD VFS DS AS WAPLAPGS EDN FA 
R F VCKNNG VTj FENQLLQ I GLKS E FRQNLG RM F I F YGNKTS TQFL 
NFTPTLICSDDLQPNLNLQTKPVDPTVEGGAQVQQWNIECVSD 
FT E A P VLN I Q FR YGGTFQNVS VQL P I TLNKFFQ PTEMAS QD FFQ 
RWKQLSNPQQEVQNI FKAKHPMDTEVTKAKI IGFGSALLEEVDP 
NPANFVGAGIIHTKTTQIGCLLRLEPNLQAQMYRLTLRTSKEAV 
SQRLCELLSAQF 


5849 


3545 


1895 


KRREIKETVFHHVAQAGLELLSSSNPPSSASRSAGITGMRHQVO 
P * DPCMSLS P P CFTEE DR FSLE ALQT I HKQMDDD KDGG I EVEES 
DEFIREDMKYKDATKKHSHLHREDKHITIEDLWKRWKTSEVHNW 
TL E DTLQ W L I E F VEL PQ YE KNFRDNNVKG TTL PR I AVH E P S FM I 
S QLKI S DRSHRQKLQLKALDWLFGPLTRP PHNWMKDF I LTVS I 
VIGVGGCWFAYTQNKTSKEHVAKMMKDLESLQTAEQSLMDLQER 
LEKAQEENRNVAVEKQNL*RKMMDEINYAKEEACRLRELREGAE 
CELSRRQYAEQELEQVRMAIjKKAEKEFELRSSWSVPDALQKWLQ 
LTHEVE VQ YYN I KRQNAEMQLAI AKDEAEKI KKKRSTVFGTLHV 








AHSSSLDEVDHKILEAKKALSELTTCLRERLFRWQQIEKICGFQ 

IAHNSGLPSLTSSLYSDHSWVVMPRVSIPPYPIAGGVDDLDEDT 

PPIVSQFPGTMAKPPGSLARSSSLCRSRRSIVPSSPQPQRAQLA 

PHAPHPSHPRHPHHPQHTPHSLPSPDPDILSVSSCPALYRNEEE 

EEAIYFSAEKQWEVPDTASECDSLNSSIGRKQSPP/SKPRDIPN 

1 1 S / DERYQEMRCP * R I PSGGIL 


5850 


3 


1895 


KAVLNFSASGSVISLTGSNPMHDASMWHLKKNGIIVYLDVPLLN 
LICRLKLMKTDRIVGQNSGTSMKDLLKFRRQYYKKWYDARVFCE 
SGAS PEE VADKVLNAI KRYQDVDS ETFI S TRHVWPEDCEQKVSA 
E F F I EAV I EG LAS DGGLF V PAKE F P KLS CG E W KS LVGAT YVERA 
Q I L LERC I H PAD I PAAR LG EM I E TA YGENFACS KI AP VRHLSGN 
QFILELFHGPTGSFKDLSLQLMPHIFAQCIPPSCNYMILVATSG 
DTGSAVLNGFSRLNKNDKQRIAWAFFPENGVSDFQKAQIIGSQ 
RENGWAVGVESDFDFCQTAIKRIFNDSDFTGFLTVEYGTILSSA 
NS INWGRLLPQVVYHASAYLDLVSQGFISFGS PVDVCI PTGNFG 
NILAAVYAKMMGIPIRKFICASNQNHVWTDFIKTG\HYDLRGKE 
N* AQTFFTVQ* I FLPNLSNLERHLHLMANKDGQLMTELFNRLES 
QHHFQIEKALVEKLQQDFVADWCSEGECLAAINSTYNTSGYILD 
PHTAVAKWADR VQD KTCPVIISS TAH YS KFAP A I MQAL K I KE I 
NETS S SQL YLLGSYNALPPLHEALLERTKQQEKMEYQVCAADMN 
VLKSHVEQLVQNQFI 


5851 




1802 


RCYLQFLALLLTSTSARAAAAIAAAEEPAGSPSVMTRAGDHNRQ 
RGCCGS LAD YLTS AKFLLYLGHSLS TWGDRMWHFAVS VFLVEL Y 
GNS LLLTAVYGL WAGS VLVLGA 1 1 GDWVDKNARLKVAQTS LW 
QNVSVILCGIILMMVFLHKHELLTMYHGWVLTSCYILIITIANI 
ANLASTATAITIQRDWIVWAGEDRSKLANMNATIRRIDQLTNI 
LAPMAVGQIMTFGSPVIGCGFISGWNLVSMCVEYVLLWKVYQKT 
PALAVKAGLKEEETELKQLNLHKDTEPKPLEGTHLMGVKDSNIH 
ELEHEQE PTCAS QMAEP FRTFRDGWVS Y YNQP VF / LGWHG S CFP 
LYDCPGL*LHHHRVRLHSGTEWFHPQYFDGSISYNWNNGNCSFY 
LAT S KM W FGS DRS DLR IGTAFL FDL VCDLC IHAW K P PG L VR FS F 


5852 


1 


422 


KTTFPSSLCPLRQLPEVRGYSGQPLTDPLISLCRSHKCRGKGWG 
SSSYPSLPALLRARSAPGHCTHRSCGPEWRIDSISRLEMQGARR 
SGWAQAQPTILLLVPRLRKSLPSIWG/SLMGPFITSGPG/WFRQ 
YYFFISGRH*VLFTESDFYYVAMDFGGHGLSSHYSPGVPYYLQT 
FVSEIRRWAGKKQSVYFRRCGGCSRAPPLITGGGVGSRKQRWP 
CibtjAV/AJjAPGLPAIHGRSWES 


5853 


223 


1346 


RLLGLSRVKGLHGPAASAWISDPETRGDPGGPWGMWRGSDLRPR 
PVSLTGLTLVCK*AAQGPQV\HSVKLCFGLGG\PCLL\FPIFRP 
LLLHPRRPRLHPGTRGVAVEPHALRWHVAHGEEAGIRAAGPGH 
GGVEIPQG/VGSLGARRGLRPSRPSSRHRNRVPAPPPGRPLATP 
HRRRFPPDPALTCPGLGQDQGPREQQKQGSGRHDTILGDWGESE 
SRWVRGNFRTGTAATLIGFSRNPTLNGSENWGSLVSIQEEGPDT 
uw bKbitKN PAbMGNPQRWAS P IHTPPLG PE ILRAMPEALRAMPE 
ALGLRPDPATSVPSALS/QTF/PESWPRSCLRNQGETLGMGPVP 
LSSLCITESPSQNWTPCLLLLTCPRGLF 


5854 


86 


938 


KG RNTAP E KKGAALNNRENAS S * NG Y / S RWKQD I RR I ENH 1 1 QE 
LKHLCAMIKRVLLERLENTRKLRELTEGRTLDWPQNRITEVSAK 
RQIVTEYREKGKRN*EEKKRDLEGRSRRYNLCIIGIPETEDRAS 
GABTIKDLLE/ENFPELKNELDLQMEKAHRIPLKFNEKKAASRH 
IRVTFL/KFORRNILOASSORFCOVTYKnATOrPT.TQnPCDaTT ma 

rrqw/n/ pi SRVLRENNFEPRI I YSAKLSFLYKGNWKTFLDIQG 
LGKYINQELSLKILLKDLLQLTENLN 


5855 


536 


2391 


LRSYGCKAPSRISHLHK\FLFLLLPSLLMGYSESPPPITDSWAP 
F I S LTHH VLS QSQSPLSSNCWICLSTHTQ* FTALPADLLTWTQS 
NVSLHISYLAIPFLADSFLKPV/L*PGNSAKHLSFKLSSLSMVS 
GRAVALLHL IASGLTS IQTNTASS KPPI WGY\LSTQTS F I SPPP 
LCLSRTYPNPAHATMVGQVPQSLCGLIFTL/RTPCRPSXLHPNY 
KIISTSAWQKVLCFSGSPTIHTSLHLTTGSSFLSFHPIPGFPAA 
NSALYVSSLKGPPGKNVTIPSPVTGT+QPPHRGSN/RLTVDKDN 








FFLSPKPNSLHQLPSQ\TPYQALTGAALAGSYPIWENENTLSWL 
PTFTYNFCLSTPSLFFLCDTN*YLCLPANWSGTCTLVFQAPTIN 
I LP PNQT I L I S VEAS I S SS P I RNKWALHL I TLLTGLG I TAALGT 
GIAGITTS ITS YQTLFTTLSNTVEDMHTS ITSLQRQLDFLVGVI 
LQNWRVLDLLTTEKGGTCIYLQEECCFCVNESGIVHIAVRRLHD 
RAAEL*HQVADSWWQGSSLLRWIPWVAPFLGPLIFLFLLLMIGP 
C I FNLVS R F I S QR LNC F I QAS M QKH I DN I FHLCHV * YQS LRGNH 
SEAPEPRP 


5856 


173 


1137 


PWLHGLGLSAVFLFYL* /YVTFHLYGGIILLLLIFISIAGILYK 
FQDVLLYFPEQPSSSRLYVPMPTGIPHENIFIRTKDGIRLNLIL 
IRYTGDNSPYSPTIIYFHGNAGNIGHRLPNAliLMLVNLKVNLLL 
VDYRGYGKSEGEASEEGLYLDSEAVLDYVMTSPDLDKTKIYLSG 
RSLG\GAAAIHLASDNSHRISAIMVENTFLSIPHMASTLFSFFP 
MRYLPLWCYKNKFLSYRKISQCRMPSLFISGLSDQLIPPVMMKQ 
LYELSPSRTKRLAIFPDGTHNDTWQCQGYFTALEQFIKEWKSH 
S PEEMAKTS SNVTI I 


5857 


1597 


563 


KL IGKVLVLS WADAMAAFAVE PQGPALGS E PMMLGS PTS PKPG 
VNAQFLPGFLMGDLPAPVTPQPRSISGPSVGVMEMRSPLLAGGS 
P PQ P WPAH KD KSGAP P VRS I YDD I S S PGLGS T P LTS RRQ PN I S 
VMQSPLVGVTSTPGTGQSMFSPASIGQPRKTTLSPAQLDPFYTQ 
GDSLTSEDH\LDDSWGDCIWGFLKASA\SYILL\QFAQYGGIS* 
NM WMSNTGN WMH I R YQ S KLQAR KALS KDGR IFGESIMIGVKPCI 
DKSVMESSDRCALSSPSLAFTPPIKTLGTPTQPGSTPRISTMRP 
LATAYKASTSDYQVISDRQTPKKDESLVSKAMEYMFGW 


5858 


355 


1419 


PPHQPAAASTSXHQQQQPPPPPQDSSKPVVAQGPGPAPGVGSAP 
PASSSAPPATPPTSGAPPGSGPGPTPTPPPAVTSAPPGAPPPTP 
PSSGVPTTPPQAGGPPPPPAAVPGPGPGPKQGPGPGGPKGGKMP 
GGPKPGGGPGLSTPGGHPKPPHRGGGEPRGGRQHHPPYHQQHHQ 
GPPPGGPGGRSEEKISGPRRGFKANLSLLRRPGEKTYTQRCRFC 
LLGIYLLISRRMNSRRLFAKIWENQEKFLSTKAKDSEFIKLESR 
ALA*NCPKPELG+YTP*GGRQLPSSLFPTHACLPLSCSVIFSPF 
MFPQ*NCWGRKPFRPNLGPHLKGAVCNRWDDPWEGPTGKGHCLN 
FAS 


5859 


307 


1503 


GGSSARPRASSRRMLSRKKTKNEVSKPAEVQGKYVKKETSPLLR 
NLMPSFIRHGPTIPRRTDICLPDSSPNAFSTSGDGWSRNQSFL 
RTPIQRTPHEIMRRESNRLSAPSYLARSLADVPREYGSSQSFVT 
EVSFAVENGDSGSRYYYSDNFFDGQRKRPLGDRAHEDYRYYEYN 
HDLFQRMPQNQGRHASGIGRVAATSLGNLTNHGSEDLPLPPGWS 
VDWTMRGRKYYIDHNTNTTHWSHPLEREGLPPGWERVESSEFGT 
YYVDHTNKKAQY\RHPCAPTCTSV+STTSCHI/AS/RQQTERNQ 
SLLVPANPYHTAEIPDWLQVYARAPVKYDHILKWELFQLADLDT 
YQGMLKLLFMKELEQIVKMYEAYRQALLTELENRKQRQQWYAQQ 
HGKNF 


5860 


2956 


1270 


TIRVEEFPLCPGGGKAQLSSASLLGAGLLLQPPTPPPLLLLLFP 
IjLLFSRLCGALAGPI I VEPHVTAVWGKNVSLKCLIE VNETITQ I 
SWEKIHGKSSQTVAVHHPQYGFSVQGEYQGRVLFKNYSLNDATI 
TLHNIGFSDSGKYICKAVTFPLGNAQSSTTVTVLVEPTVSLIKG 
PDSLIDGGNETVAAICIAATGKPVAHIDWEGDLGEMESTTTSFP 
NE TAT IISQYKLFPTR FARGRR I TC WKH PALEKD IR YS FI LD I 
QYAPEVSVTGYDGNWFVGRKGVNLKCNADANPPPFKSVWSRLDG 

W*» r ltjuimxo ui\ lljrlr vn.tr LiX V i\J I ovjV jl ILJvV 1 \NolrGoKEVTQK 

VHPTFQDPSLPTYPPLPALQFQWASPSTA*TSRD\LATEP*KIA 
PSPLSTL\ATIKGWTQLPTIIA*CSGVGALFIV\LVKCFGLGIF 
CYRRRRTFRGDYFAKNYIPPSDMQKESQIDVLQQDELDPYPDSV 
KKENKNPVNNLIRKDYLEEPEKTQWNNVENLNRFERPMDYYEDL 
KMGMKFVSDEHYDENEDDLVSHVDGSVISRREWYV 


5861 


2051 


1305 


EVCACVQAFWLVASSGDDSQGGDKCGCEVGSWVGSMRWMAkLL 
SEGEQG I PTACAAFAQQPAG/E PRRGLAGVGEGG PQCS WVNYRC 
TLEFLVSLLGTDLARGRGNSASGPTAPADSKQL»/ML*DVHRRVI 
LE + RMNSGS PARDNAPSQRFCTNLS EGLRFG I S PSWREAL YGCH 








A 


5862 


1556 


483 


PPFQLIMGE I KVS PDYNVJFRGTVPLKKI I VDDDDS KI WSLYDAG 
PRS I RCPLI FLPPVSGTADVFFRQ I LALTGWGYRVTALQYPVYW 
DHLEFCDGFRKLLDHLQLDKVHLFGASLGGFLAQKFAEYTHKSP 
RVHSLILCNSFSDTSIFNQTWTANSFWLMPAFMLKKIVLGNFSS 
G P VD PMMADAI D FM VDRLE S LGQS E LAS RLTLNCQNS Y VE PHK I 
RDIPVTIMDVFDQSALSTEAKEEMYKLYPNARRAHLKTGGNFPY 
LCRS AE VNLYVQI HL/R / RNS ME PNTRPLTHQWS VPRS LRCRKA 

ALASARRSSSVSLAVNDELTRCVLV*SVASAPVSRPFPSGSSGS 
PVLTVSGK 


5863 


2714 


249 


PFPSRGSLPLAAPREDTMGPLMVLFCLLFLYPGLADSAPSCPQN 
VNISGGTFTLSHGWAPGSLLTYSCPQGLYPSPASRLCKSSGQWQ 
TPGATRSLSKAVCKPVRCPAPVSFENGIYTPRLGSYPVGGNVSF 
ECEDGFI \LRGS PVRQCRPNGMWDGETAVCDNGAGHCPNPG I SL 
GP\VRTGFRFGHGDKVRYRCSSNLVLTGSSERECQGNGVWSGTE 
PICRQPYSYDFPEDVAPALGTSFSHMLGATNPTQKTKESLGRKI 
Q IQRS GHLNL YLLLDCS QS VS END FL I F KESAS LMVDR I FS FE I 
NVSVAIITFASEPKVLMSVLNDNSRDMTEVISSLENANYKDHEN 
GTGTNT YAALNS V YIiMMNNQM RLLGM E TM AW \Q E I RHA 1 1 LL \T 
DGK\SHMGGSPKTAVDHIREILNINQKRNDYLDIYAIGVGKLDV 
DWRELNE LGS KKDGERHAF I LQDT KALHQVFBHMLDVS KLTDTI 
CX3VGNMSANASDQERTPWHVTIKPKSQET\C\RGALISDQWVLT 
AAHCFRDGNDHSLWRVNVGDPKSQWGKEFLIEKAVISPGFDVFA 
KKNQGIL\EFYGD\DIALL\KLAQKVKM\STHCQGPSCLP\CTM 
\EANLGFLRETFKGSTCR\DHENEL/VWNKQSV\PAHF\VAL\N 
GS KLE HLTLRMG VE WTS C CRGLS P KKKTM \ F PNLT \ D VRE \ WT 
D\QFL\CS\GPQEDESP\CK*E\SGGA\VFLERRFRLSAGGVWC 
SWGL\YNP\CLGSA\DKNSPKKGPSVAKVPPPTR/DFHIN\LFP 
Q * S PWLRQHPGGMS * I FLPLLANGHLS P FACPAR I CRPLHFLPS 
EWATLRTL 


5864 


173 


1013 


PLISVPQSLISLPQPLLCFPGGQEPSAPSPCLYSFLWACSFTMG 
KLPPSIPPSSPLACVLKNLKPLQLTPDLXPKCLIFFCNTAWPQY 
KLDNDSK*PENGTFEFSILQVLDNSCHKMGKWSEVPDVQAFF\S 
HWSLPSLCSQC/GLIPNLSSFSPFCSFG/PPPQVPSP/TESFFS 
MDSSDLPPSPQAAPRQAEPGPNSHLASAPPPYNPFITSPPHTWS 
SLQFHSVTSPPPPAQQFTLKKVAGAKGIVKVSAPFSLSQIR*RL 
GSFSSNIKIQPSSWLIWQQP 


5865 


568 


1684 


CLPGPRWGEGWRAGHTI VGCI FFKTAI I SHFKGGM YLCVCMCTC 
LSVCVCVQVGSWICV/CVSMCACVSLCTC\ICRCISMYTREHAC 
ACTRV*VYMCMS/VCTCVSTCIDVRVC7UiVCVYMCLCLGYA*AC 
TCV*MCVCMHEHVCMC/VCACSCVLL/CRGHICM/MCMSAYICI 
/ CVY VC VLCVWACMRMSTCVWLVYG * ACTCVWMHM /CS CTCR /C 
VHVC CMS MHACE CL CVYLH I CGCAGTRRW WAGSARGS R S CS R LP 
CWAPGPGLSLPGPSCPSVEQGLGGGPGQLQGRSGEARLGEHRGW 
GSPAAVCSRNCTVSPRRGADCFEAPDVPKQPPGWGRASFEERGC 
GGRGW VCAP PLNGPQCCCFS I KP ELKAKKKK 


5866 


98 


3197 


ARPEVPAPPAWLSRRGAAKMGDKKDDKDSPKKNKGKERRDLDDL 
KKEVAMTEHKMSVEEVCRKYNTDCVQGLTHSKAQEILARDGPNA 
LTPP PTT PSW VKFCRQLFGG FS I LLW IGAI LCFLAYG I QAGTED 
DPSGDNLYLGIVLAAWIITGCFSYYQEAKSSKIMESFKNMVPQ 
QALV I REGE KMQVNAE EVWGDLVEI KGGDRVPADLR 1 1 S AHGC 
KVDNSSLTGESEPQTRSPDCTHE\NPLKTRNITFFSNNFVEGTA 
RGVWATGDRTVMGRIATLASGLEVGKTPIAIEIEHFIQLITGV 
AVFLGVSFFILSLILGYTWLEAVI FLIGI I VANVPEGLLATVTV 
CLTLTAKRMARKNCLVKNLEAVETLGSTS T I CSDKTGTLTQNRM 
TVAHMWFDNQIHEADTTEDQSGTSFDKSSHTWVALF*H/LLGFC 
NRPVFKGGQDNI PVLKRDVAGDASESALLKCIELSSGSVKLMRE 
RNKKVAE I PFNSTNKYQLS IHETEDPNDNRYLLVMKGAPERI LD 
RCSTILLQGKEQPLDEEMKEAFQNAYLELGGLGERVLGFCHYYL 
PEEQFPKGFAFDCDDVNFTTDNLCFVGLMSMIGPPRAAVPDAVG 
KCRSAGIKVIMVTGDHPITAKAIAKGVGI I FEGNETVEDIAARL 








NIPVSQVNPRDAKACVIHGTDLKDFTSEQIDEILQNHTEIVFAR 
TSPQQKLI IVEGCQRQGAIVAVTGDGVNDSPALKKADIGVAMGI 
AGSDVS KQAADMI LLDDNFAS I VTGVEEGRLIFDNLKKS IAYTL 
TSNIPEITPFLLPIMANIPLPLGTITILCIDLGTDMVPAISLAY 
EAAESDIMKRQPRNPRTDKLWERLISMAYGQIGMIQALGGFFS 
YFVILAENGFLPGNLVGIRLNWDDRTVNDLEDSYGQQWTYEQRK 
WE FTCHTAFFVS I VWQWADL 1 1 CKTRRNS VFQQGMKNKIL I F 
G L FE ET ALA AFLS YC PG MDVALRM YPL K P S W WFCAFP YS FL I FV 
YDEIRKLILRRNPGGWVEKETYY 


5867 


3 


1485 


LPGRRARGGRGLGWPPAQALDGSRMGKAKVPASKRAPSSPVAKP 
GPVKTLTRKKNKKKKRFWKSKAREVSKKPASGPGAWRPPKAPE 
DFSQNWKALQEWLLKQKSQAPEKPLVISQMGSKKKPKIIQQNKK 
ETSPQVKGEEMPAGKDQEASRGSVPSGSKMDRRAPVPRTKASGT 
EHNKKGTKERTNGDIVPERGDIEHKKRKAK\GQPQPHPPR/IDI 
WFDDVDPADIEAAIGPEAAKIARKQLGQSEGSVSLSLVKEQAFG 
GLTRALALDCEMVGVGPKGEESMAARVSIVNQYGKCVYDKYVKP 
TEPVTDYRTAVSGIRPENLKQGEELEWQKEVAEMLKGRILVGH 
ALHNDLKVLFLDHPKKKIRDTQKYKPFKSQVKSGRPSLRLLSEK 
I LGLQVQQAEHCS IQDAQAAMRL YVMVKKEWESMARDRRPLLTA 
PDHCSDDA*QSCPAAAAAPLQRQCDQSQGQITSPQSGNSGETFS 
ESWQRGVAWCY 


5868 


2122 


833 


LTAGASHTQDASQSTSAKYPAAAQNL/CVTNAMREDLADIWYIR 
AVTVYDKPASFFKETPLDLQHRLFMKLGSMHSPFRARSEPEDPV 
TE RS AFTE RD AG S GL VTRLRERPALLVS STSWTEDEDFSI LLAA 
LESRV*T\MTLDGHNLPSLVCVITGKGPLREYYSRLIKQKHFQH 
IQVCTPWLEAEDYPLLLGSADLGVCLHTSSSGLDLPMKWDMFG 
CC L P VCAVN FKC LHELVKHEENG L VFEDS E ELAAQLQML FSNFP 
DPAGKLNQFRKNLRESQQLRWDESWVQTVLPLVMDT 


5869 


2122 


833 


LTAG AS H TQ D AS Q S TS AK YP AAAQ N L / C VTNAMRE DLAD I W Y I R 
AVTVYDKPASFFKETPLDLQHRLFMKLGSMHSPFRARSEPEDPV 
TERSAFTERDAGSGLVTRLRERPALLVSSTSWTEDEDFS I LLAA 
LESRV*T\MTLDGHNLPSLVCVITGKGPLREYYSRLIHQKHFQH 
IQVCTPWLEAEDYPLLLGSADLGVCLHTSSSGLDLPMKVVDMFG 
CCLPVCAVNFKCLHELVKHEENGLVFEDSEELAAQLQMLFSNFP 
DPAGKLNQFRKNLRESQQLRWDESWVQTVLPLVMDT 


5870 


2122 


833 


LTAGASHTQDASQSTSAKYPAAAQNL/CVTNAMREDLADIWYIR 
AVTVYDKPASFFKETPLDLQHRLFMKLGSMHSPFRARSEPEDPV 
TERSAFTERDAGSGLVTRLRERPALLVS STSWTEDEDFSI LLAA 
LESR V * T \ MTLDGHNL PS L VCVI TG KG P LRE Y YS RL I HQKH FQH 
IQVCTPWLEAEDYPLLLGSADLGVCLHTSSSGLDLPMKWDMFG 
CCLPVCAVNFKCLHELVKHEENGLVFEDSEELAAQLQMLFSNFP 
DPAGKLNQFRKNLRESQQLRWDESWVQTVLPLVMDT 


5871 


3 


3465 


FFFCRPLRLYSKTTGDRSAMAGAAGLTAEVSWKVLERRARTKRS 
VLKLL* LSLRRL*LEPTI *NGLLT*CSRLS VFRFLKV\GSVYEP 
LKSINLPRPDNETLWDKLDHYYRIVKSTLLLYQSPTTGLFPTKT 
CGGDQ KAK I QDSLYCAAGAWALALAYRR I DDDKGRTHELEHS A I 
KCMRGILYCYMRQADKVQQFKQDPRPTTCLHSVFNVHTGDELLS 
YEEYGHLQINAVSLYLLYLVEMISSGLQIIYNTDEVSFIQNLVF 
CV\ERVYRVP\DFG\VWGKREGKYY*/SGSTELHSSSVGLGKRQ 
L* KQFNGFNLFGNQGCSWSVI FVDLDAHNRNRQTLCSLLPRESR 
SHNTDAALL PCI S YPAFALDDE VLFSQTLDKWRKLKGKYGFKR 
FLRDG YRTS LED PNRC Y YKPAE I KLFDG I E CE FP I FFL YMM IDG 
VFRGNPKQVQEYQDLLTPVLHHTTEGYPWPKYYYVPADFVEYE 
KNNPGSQKRFPSNCGRDGKLFLWGQALYIIAKLLADELISPKDI 
DPVQRYVPLKDQRNVSMRFSNQGPLETOLWHVALIAESQRLQV 
FLNTYGI QTQTPQQVEP I QI WPQQELVKAYLQLGINEKLGLSGR 
PDRPIGCLGTSKIYRILGKTWCYPIIFDLSDFYMSQDVFLLID 
D I KNALQ F I KQ YWKMHGR PLFLVL I REDNI RGSRFNP I LDMLAA 
LKKGIIGGVKVHVDRLQTLISGAWEQLDFLRISDTEELPEFKS 
FEELEPPKHSKVKRQSSTPSAPELGQQPDVNISEWKDKPTHEIL 
QKLNDCSCLASQAILLGILLKREGPNFITKEGTVSDHIERVYRR 








AGSQKLWSWRRAASLLSKWDSLAPSITNVLVQGKQVTLGAFG 
HEEEVISNPLSPRVIQNIIYYKCNTHDEREAVIQQELVIHIGWI 
ISNNPELFSGTLKIRIGWIIHAMEYELQIRGGDKPALDLYQLSP 
bLVKQLLL»DI LQ PQQNGRC WLNRRQ I DG SLNRTPTG FYDRVWQ 1 
LERTPNGIIVAGKHLPQQPTLSDMTMYEMNFSLLVEDTLGNIDQ 
PQYRQ I WELLM WS I VLERNPELE FQDKVDLDRLVKEAFNEFQ 
KDQSRLKEIEKQDDMTSFYNTPPLGPCRGTCSYLTKAVMNLLLEG 
EVKPNNDDPCLIS 


5872 


66 


665 


VQGYMYRFVIKINSCYSEKTSICRHRCCPELPATQPWPTPTVFF 
NIAIDSESLGCI\SFKLFADKV/PKRWKKNFVLLNTGEKVLGDK 
GPCFYRIIPG\L cqggd fthhngtggks l YS KE FDDENF I / LKH 
TAPG VLSTANAG PTTNGS Q FF I CTAKTEDG * QHWFGKVKDGMS 
IVEALERSGSRNGKTSKKITAANCGQL 


5873 


2240 


506 


RRPPEGGSGGGRRTRARMPLPWSLALPLLLSWVAGGFGNAASAR 
HHGLLASARQPGVCHYGTKLACCYGWRRNSKGVCEATCEPGCKF 
GECVGPNKCRCFPGYTGKTCSQDVNECGMKPRPCQHRCVNTHGS 
YKCFCLSGHMLMPDATCVNSRTCAMINCQYSCEDTEEGPQCLiCP 
SSGLRLAPNGRDCLDIDECASGKVICPYNRRCVNTFGSYYCKCH 
IGFELQYISGRYDCIDINECTMDSHTCSHHANCFNTQGSFKCKC 
KQGYKGNGLRCSAI PENS VKEVLRAPGTI KDRI KKLLAHKNSMK 
KKAKIKNVTPEPTRTPTPKVNLQPFNYEEIVSRGGNSHGG\KKG 
NEEKMKEGLEDEKREEKALKD*HRRERPFRG\DVFFPKVNEAGE 
FGLIL\VQRKALTSKLEHKADLNISVDCSFNHG\ICDW\KQDR\ 
EDDFDW\NPADR\DNAI\GFY\MAVPGLWQGHK\KDIGRLKLLL 
PDLQPQSNFCLLFDYRLAGDKVGKLRVFVKNSNNALAWEKTTSE 
DEKWKTGKIQLYQGTDATKS I IFEAERGKGKTGE IAVDGVLLVS 
GLCPDSLLSVDD 


5874 


2 


3387 


AC P RLARRRRR VRS LRRRRG W LRAR WS RGQNNMAARR I TQE TFD 
AVLQEKAKRYHMDASGEAVSETLQFKAQDLLRAVPRSRAEMYDD 
VHSDGRYSLSGSVAHSRDAGRESLRSDVFSGPSFRSSNPSISDD 
SYFRKECGRDLEFSHSNSRDQVIGHRKLGHFRSQDWKFALRGSW 
EQDFGHPVSQESSWSQEYSFGPSAVLGDFGSSRLIEKECLEKE\ 
S RDYD VDH S G \ E A \ D S VLRGS \ SQ VQ A \ RGRALN I VDQEGS LLG 
KG ETQG LLTAKGG VG KLVTLRNVS T KK I PT VNR I T P KTQGTNQ I 
QKNTPS PDVTLGTNPGTED I QFPIQKI PLGLDLKNLRLPRRKMS 
FDI IDKSDVFSRFGI EI IKWAGFHTI KDD I KFS Q L FQTLFE LE T 
ETC7U<WIiASFKCSLKPEHRDFCFFTIKFLKHSALKTPRVDNEFL 
NMLLDKGAVKTKNCFFEIIKPFDKYIMRLQDRLLKSVTPLLMAC 
NAYELSVKMKTLSNPLDLALALETTNSLCRKSLALLGQTFSLAS 
SFRQEKIL+AVGLQDIAPSPAAFPNFEDSTLFGREYIDHLKAWL 
VSSGCPLQVKKAEPEPMREEEKMIPPTKPEIQAKAPSSLSDAVP 
QRADHRWGTIDQLVKRVIEGSLSPKERTLLKEDPAYWFLSDEN 
S LE YK Y YKLKLA EMQRMSENLRGADQKP TS ADCAVRAML YSRAV 
RNLKKKLLP\WQRRGLLRAQG\LRG\WKARRA\TTGTQTLLFLR 
APGLKHHGRQAPGLS\QAKPSLPDRND\AAKD\CPLDPV\GPSP 
QDPSLEASGPSPKPAGVDISEAPQTSSPCPSADIDMKDNGRTAE 
KLARFVAQVG\PEIEQF\SI\ENSTDNPDLWFL\HDQNSS\AFK 
FY\RKKVFELCPSICFTSSPHNL\HTGGGDTT\GSQESPVDLME 
GEAEFEDEPPPREAELESPEVMPEEEDEDDEDGGEEAPA\PGRG 
G PS LEGSTPADGLPG EA\ AEDDL/ ALGAPAIiFTGLLQVTCFPFG 
RGFSSKSLKVGMIPAPKRVCLIQEPKVHEPVRIAYDRPRGRPMS 
ruus-ftf ]\vuu e a^qkIj \ TDK \NLGFQ \MLQKMGWKEGHGLGSLGK 
G I R \ S RS AC TQQ AAWGGS GWG LS P S TCS L P LGS FTAKMA YS WQL 
IFVF 


5875 


296 


1848 


LAALGGLPLWRLSRRGFREYLLGLSAPSALGGAMRSVSYVQRVA 
LEFSGSLFPHAICLGDVDNDTLNELWGDTSGKVSVYKNDDSRP 
WLTCSCQGMliTCVGVGDVCNKGKNIiLVAVSAEGWFHLFDLTPAK 
VLDASGHHETLIGEEQRPVFKQHIPANTKVMLISDIDGDGCREL 
WGYTDRWRAFRWEELGEGPEHLTGQLVSLKKWMLEGQVDSLS 
VTLGPLGLtPELMVSQPGCAYAILIiCTWKKDTGSPPASEGPTDGS 
/SGDPSCPRRGAAPDIWPYPQQECLHSPNWQHQT\SHGTESSGS 








GLFALCTLDGTLKLMEEMEEADKLLWSVQVDHQLFALEKLDVTG 
NGHEEWACAWDGQTY 1 1 DHNRTWRFQVDENIRAFCAGLYACK 
EGRNSPCLVYVTFNQKIYVYWEVQLERMESTNLVKLLETKP\ST 
TACCRSWAWILTTSL*LVPCFTKRSTIQTSHHSVLPQASRIPPS 
WTCL I AGEG FF* TPTL P PKGVFGSHCAAAGS I TKQ 


5876 


1122 


224 


HLPLGVPSKVAGAAAMEPQEERETQVAAWLKKIFGDHPIPQYEV 
NPRTTE I LHHLS ERNRVRDRD V YLVI EDLKQKASEYES EAKYLQ 
DLLMESVNFSPANLSSTGSRYLNALVDSAVALETKDTSLASFIP 
AVNDLTSDLFRTKSKSEEIKIELEKLEKNLTATLVLEKCLQEDV 
KKAE LH LS T ER \ AKVDNRRQNM \ D FLKAKS E E FRFG I Q AAG EQL 
SARGQ\DAFSVPIQSLVALIRENWPRLKQQTIPLK\KKLESYLD 
LMP\NPSHCSK+RIEEAK\RELA\SIEAELTRRVS\MMEL 


5877 


2030 


1907 


GTLGKMAAS S S GEKEKERLGGGLGVAGGNS TRERLLS ALEDLE V 
LSRELIEMLAISRNQKLLQAGEENQVLELLIHRDGEFQELMKLA 
LNQGKI HHEMQVLEKEVEKRDSD I QQLQKQLKEAEQ I LATAVYQ 
AKE KLKS I E KAR KGA I SSEE 1 1 KYAHR I SASNAVCAPLT WVPGD 
PRRPYPTDLEMRSGLLGQMNNPSTNGVNGHLPGDALA/RRKIAR 
CPCSTVS/NGSQMTCR*INIILILQKSVCEL 


5878 


950 


2113 


GLWKCMQLQGPHTHRVQP*PTPRQQGPQ\VPVAVIAGNRPNYLY 
RMLRSLLSAQGVSPQMITVFIDGYYEEPMDWALFGLRGIQHTP 
I S I KNAR VSQH YKAS LTAT FNL F P E AKFAWLE E DLD I AVD FFS 
FLSQSIHLLEEDDSLYCISAWNDQGYEHTAEDPALLYRVETMPG 
LGWVLRRSLYKEELEPKWPTPEKLWDWDMWMRMPEQRRGRECII 
PDVSRSYHFGIVGLNMNGYFHEAYFKKHKFNTVPGVQLRNVDSL 
KKEAYEVEVHRLLSEAEVLDHSKNPCEDSFLPDTEGHTYVAFIR 
MEKDDDFTTWTQLAKCLHIWDLDVRGNHRGLWRLFRKKNHFLW 
GVPAS P YS VKKPPS VTP I FLE P P P KE EG APGAP E QT 


5879 


3 


981 


RLTEAAAAGSGSRAAGWAGSPPTLLPLSPTSPRCAATMASSDED 
GTNGGASEAGEDREAPGKRRRLGFLATAWLTFYDIAMTAGWLVL 
AIAMVRFYMEKGTHRGLYKS I QKTLKFFQTFALLE I VHCLIGI V 
PTSVIVTGVQVSSRIFMVWLITHSIKPIQNEESWLFLVAWTVT 
EITRYSFYTFSLLDHLPYFIKWARYNFFIILYPVGVAGELLTIY 
AALPHVKKTGMFSIRLPNKYNVSFDYYYFLLITMASYIPLFPQL 
YFHMLRQRRKVLHG\G*L*KRMIK+SLQTRCFFQNNQDYLSPSF 
NNKNKQLCEISWIVWFLKI 


5880 


1138 


1324 


SLWCLVAGGLGLGPSSQNPLQRAGILARPREARGTFSALTACSA 
SVTSKGKSSSGMWPSAASDRDSPVPLRPPGPVQLPSGTCWVLSD 
♦KKKRGRCSS/WLSQPQHEREKEVVLLRRSMAEGERARAASDVL 
CRSLANE THQLRRTLTATAHMCQHLAKCLDERQHAQRNVGERS P 
DQSEHTDGHTS VQS V I EKLQE ENRLLKQ KVTHVEDLNAKWQR YN 
ASRDEYVRGLHAQLRGLQIPHEPELMRKEI3RLNRQLEEKINDC 
AEVKQELtAASRTARDAALERVQMLEQQILiAYKDDFMSERADRER 
AQ SRIQELEE KVAS LLHQ VS WRQ DS R E P DAGR I HAGS KTAK YLA 
ADALELMVPGGWRPGTGSQQPEPPAEGGHPGAAQRGQGDLQCPH 
CLQCFSDEQGEELLRHVAECCQ 


5881 


26 


441 


GGIHPSPTEAPRAQHLTMDCTWRILFLVAAATGTHAQVQLLQSG 
SEVKKPGASVMVSCYVSGYTLTKLSMHWVRQAPGKGLE+MGPFD 
LQDVETIYPQKFQGRVSMTEETSTETTQ/AYLELSSIiRSEDTAV 
HHCATDTV 


5882 


2407 


2216 


SGC^MLYSHSLEYNPEWISVQSAVAPAQLALNSDGDL*LHSGE 
RTRRD*QLPEAGGPGLQEPLQLGELDITSDEFILDEVDG\VDLR 
HYSKQVELELQQIEQKSIRDYIQESENIASLHNQITACDAVLER 
MEQMLGAFQSDLSSISSEIRTLQEQSGAMNIRLRNRQAVRGKLG 
E L VDGL W P S ALVTA I LE AP VTE P R FLE QLQELD AKAAAVREQE 
ARGTAACADVRGVLDRLRVKAVTKIREFILQKIYSFRKPMTNYQ 
IPQTALLKYRFFYQFLLGNERATAKEIRDEYVETLSKIYLSYYR 
SYLGRLMKVQYEEVAEKDDLMGVEDTAKKGFFSKPSLRSRNTIF 
TLGTRGSVISPTELEAPILVPHTAQRGEQRYPFEALFRSQHYAL 
LDNSCREYbFICEFFWSGPAAHDLFHAVMGRTLSMTLKHLDSY 
LADC YDA I AVFLC I H I VLRFRNI AAKRDVPALDR YWEQVLALL W 








PRFELILEMNVQSVRS'i'DPQRLGGLDTRPHYITRRYAEFSSALV 
SINQTIPNERTMQLLGQLQVEVENFVLRVAAEFSSRKEQLVPLI 
NNYDMMLGVLM\E*ERAADDSKEVESFQQLLNARTQEFIEEIiLS 
PPFGGLVAFVKEAEALIERGQAERLRGEEARVTQLIRGFGSSWK 
SSVESLSQDVMRSFTNFRNGTSIIQGALTQLIQ\LYHRFHRV\L 
SQPQLRALPARAELINIHHLMVELKKHKPNF \ 


5883 


2 


1374 


E F PGRR FRAVME AGAGAGAGAAG WS C PG PGPT VTTLGSYEASEG 
CERKKGQRWGSLERRGMQAMEGEVLLPALYEEEEEEEEEEEEVE 
EEEEQVQKGGSVGSLSVNKHRGLSLTETELEELRAQVLQLVAEL 
EETRELAGQHEDDSLELQGLLEDERLASAQQAEVFTKQICXSIiQG 
ELRSLREEISLLEHEKESELKEIEQELHLAQAEIQSLRQAAEDS 
ATEHESDIASLQEDLCRMQNELEDMERIRGDYEMEIASLRAEME 
MKSSEPSGSLGLSDYSGLQEELQELRERYHFLNEEYRALQESNS 
SLTGQLADLESERTQRATERWLQSQTLSMTSAESQTSEMDFLEP 
DPEMQLLRQQLRDAEEQMHGMKNKCQELCCELEELQHHRQVSEE 
EQRRLQRELKCAQNEVLRFQTSHS\SPSHPLPPIPPSSPCLL*A 
LWISALLWCWWAETSS 


5884 


4261 


2522 


GVLARASARLRVPLTGVRACAEPEVGAEPAKVAGAAEPDEDGGR 

SRLRDCGDYTPSERLGPKGAKHjWFQGAIPAAIATAKRSGAVFW 

FVAGDDEQSTQMAASWEDDKVTEASSNSFVAIKIDTKSEACLQF 

SQIYPWCVPSSFFIGDSGIPLEVIAGSVSADELVTRIHKVRQM 

HLLKSETSVANGSQSESSVSTPSASFEPNNTCENSQSRNAELCE 

IPSTSDTKSDTATGGESAGHATSSQEPSGCSDQRPAEDLNIRVE 

RLTKKLEERREEKRKEEEQREIKKEIERRKTGKEMLDYKRKQEE 

BLTKRMLEERNREKAEDRAARER I KQQ I ALDRAERAARFAKTKE 

EVEAAKAAALLAKQAEMEVKRESYARERSTVARIQFRLPDGSSF 

TNQFPSDAPLEEARQFAAQTVGNTYGNFSLATMFPRREFTKEDY 

KKKLLDLEItAPSASWLLP/ALFINF*AGRPTASIVHSSSGDIW 

TLLGTVLYPFLAIWRLISNFLFSNPPPTQTSVRVTSSEPPNPAS 

SSKSEKREPVRKRVLEKRGDDFKKEGKIYRLRTQDDGEDENNTW 
NGNSTQQM 


5885 


900 


467 


AAGGGRRSRLSRSWPTGPSKSPSGVRCCG\RR\AWEDKDEFLDV 
IYWFRQIIAWLGVIWGVLPLRGFLGIAGFCLINAGVLYLYFSN 
YLQIDEEEYGGTWELTKEGFMTSFA/IVHGHLDHLLHCHPL*LM 
VYSSQVLPIQSKGPS 


5886 


86 


1 Til 


PFRGRALTLKKQPRPGVAPPSLGTCHKSDPGRPAAQSQPPSPGrS 
GTFGLLSFRMVRTKTWTLKKHFVGYPTNSDFELKTSELPPLKNG 
EVLLEALFLTVDPYMRVAAKRLKEGDTMMGQQVAKWESKNVAL 
PKGT I VLAS PG WTTHS I S DG KDLE KLLT EW P DT I P LS LALGTVG 
MPGLTAYFGLLEICGVKGGETVMVNAAAGAVGSWGQIAKLKGC 
KWGAVGSDEKVAYLQKLGFDWFNYKTVESLEETLKKASPDGY 
DCYFDNVGGEFSNTVIGQMKKFGRIAICGAISTYNRTGPLPPGP 
P PE IG I YQELRMEAF WYRWQGD ARQKALiKDLL KW VLEL P YFV I 
D*LQANTLVYKSMKSAKPSLEYISEKLVSG\KIQYKEYIIEGFE 
NMPAAFMGMLKGDNLGKTIVKA 


5887 


1937 


104 


APGCRGCRATRCPCRGPRWDSLGDEAARSPAAPGGAPGLLGLRE 
R P DR CH PGGDDRG P Q LHRG SPG/SPSELS RRPG P PGL PGLQGP P 

PAPGLPQSRTL/PVLCVCDLSPAQCDINCCCDPDCSSVDFSVFS 
ACSVPWTGDSQFCSQKAVIYSLNFTANPPQRVFELVDQINPSI 
FCIHITN\*NLHYPLLIQKYL/NENNFDTLMKTSDGFTLNAESY 

VSFTTKLDlPTAAKYEYGVPT.nTCinQPT ppnocr toot r»rrr\xTMr» 

AAFLVNQAVKCTRKINLEQCEEIEALSMAFYSSPEILRVPDSRK 
KVPITVQSIVIQSLNKTLTRREDTDVLQPTLVNAGHFSLCVNW 
LEVKYSLTYTDAGEVTKADLSFVLGTVSSWVPLQQKFEIHFLQ 
E NTQ PVP LS GN"PG Y WGLPLAAG FQPHKGS G I IQTTNR YG QLT I 
LHSTTEQDCLALEGVRTPVLFGYTMQSGCKLRLTGALPCQLVAQ 
KVKSLLWGQGFPDYVAPFGNSQGP/ADMLDWVPIHFITQSFNRK
DSCQLPGALVIEVKWTKYGSLLNPQAKIVNVTANLISSSFPEAN

SGNERTILISTAVTFVDVSAPAEAGFRAPPAINARLPFNFFFPF 
V 
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Predicted end 
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sequence 


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
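
The legend above can be read as a small alphabet: twenty one-letter residue codes, X for an unknown residue, and three markers (*, / and \). As a minimal sketch, assuming a segment is available as a plain string, the following Python function counts residues and markers and collects any character the legend does not define. The function name `summarize_segment` and the example string handling are illustrative assumptions, not part of the listing itself.

```python
import re

# One-letter codes named in the legend: 20 amino acids plus X (unknown).
RESIDUES = set("ACDEFGHIKLMNPQRSTVWY") | {"X"}

# Non-residue markers named in the legend.
MARKERS = {
    "*": "stop codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def summarize_segment(raw):
    """Count residues and legend markers in one amino acid segment.

    Returns (counts, unrecognized), where unrecognized lists every
    character that the legend does not define.
    """
    text = re.sub(r"\s+", "", raw)  # drop line breaks and stray spaces
    counts = {"residues": 0}
    counts.update({name: 0 for name in MARKERS.values()})
    unrecognized = []
    for ch in text:
        if ch in RESIDUES:
            counts["residues"] += 1
        elif ch in MARKERS:
            counts[MARKERS[ch]] += 1
        else:
            unrecognized.append(ch)
    return counts, unrecognized


if __name__ == "__main__":
    # A raw string keeps the backslash marker literal.
    line = r"EQRRLQRELKCAQNEVLRFQTSHS\SPSHPLPPIPPSSPCLL*A"
    counts, unrecognized = summarize_segment(line)
    print(counts)
    print(unrecognized)
```

Because the function is case-sensitive, lowercase letters, digits or punctuation introduced during text extraction end up in the unrecognized list, which gives a quick way to flag lines that need checking against the original listing.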


5888 


375 


2302


LLCRTPGVAMQRADSEQPSKRPRCDDSPRTPSNTPSAEADWSPG

LELHPDYKTWGPEQVCSFLRRGGFEEPVLLKNIRENEITGALLP 

CLDESRFENLGVSSLGERKKLLSYIQRLVQIHVDTMKVINDPIH 

GHIELHPLLVRIIDTPQFQRLRYIKQLGGGYYVFPGASHNRFEH 

SLGVGYLAGCLVHALGEKQPELQISERDVLCVQIAGLCHDLGHG

PFSHMFDGRFIPLARPEVKWTHEQGSVMMFEHLINSNGIKPVME

QYGLIPEEDICFIKEQIVGPLESPVEDSLWPYKGRPENKSFLYE 

IVSNKRNGIDVDKWDYFARDCHHLGIQNNFDYKRFIKFARVCEV 

DNELRICARDKEVGNLYDMFHTRNSLHRRAYQHKVGNIIDTMIT 

DAFLKADDYlEirGAGGKKYRISTAIDDMEAYTKLTDNIFLEIL 

YSTDPKLKDAREILKQIEYRNLFKYVGETQPTGQIKIKREDYES 

LPKEVASAKPKVLLDVKLKAEDFIVDVINMDYGMQEKNPIDHVS 

FYCKTAPNRAIRITKNQVSQLLP\EKFAEQ\LIRVYCKKVDRKS 

LYA\ARQYFVQW\CADR\NFT\KPQDGRCY*PPTP*HPQKKGW\ 

NDSTFSPKIPTRLPRRLPKSRV\QLFKDDPM 


5889 


1831 


731 


LPAACGRPVTARPRQAPEGRSGRPRDLDPYPPQVFPPRPDRVAI
VTGGTDGIGYSTAKHLARLGMHVIIAGNNDSKAKQWSKIKEET 
LNDKET*VLLCCPGWLCLWNSSDPPTSASRGAGTTGVHHHFLLK
FGIFIL\DLASrfTSIRQFVQKFKMKKIPLHVLINNAGVMMVPQR 
KTRDGFEEHFGLNYLGHFLLTNLLLDTXjKESGSPGHSARWTVS 

SATHYVAELNMDDLQSSACYSPHAAYAQSKIIALVLFTYHLQRLL
AAEGSHVTANWDPC5WNTDLYKHVFWATRLAKKLLGWLLFKTP
DEGAWTSIYAAVTPELEGVGGRYLYNKKETKSLHVTYNQKLQQQ
LWSKSCEMTGVLDVTL


5890 


1322 


200 


FRRGWSAAGRAVPVAFCSRISASSPRRPRGAVRLQSGTEAACRS
GRPDPRPASAAGGHAGERMSQRDTLVHLFAGGCGGTVGAILTCP
LEWKTRLQSSSVTLYISEVQLNTMAGASVNRWSPGPLHCLKV
ILEKEGPRSLFRGLGPNLVGVAPSRAIYFAAYSNCKEKLNDVFD
PDSTQVHMISAAMAGFTAITATNPIWLIKTRLQL*/SQGTAGKR
RMGAFECVRKVYQTDGLKGFYRGMSASYAGISETVIHFVIYESI
KQKLLEYKTASTMENDEESVKEASDFVGMMLAAATSK\LVATTR

AYPHEWRTRLREEGTKYRSFFQTLSLLVQEEGYGSLYRGLTTH 
LVRQIP\NTAIMMATYELWYLLNG


5891 


1322 


200 


FRRGWSAAGRAVPVAFCSRISASSPRRPRGAVRLQSGTEAACRS 
GRPDPRPASAAGGHAGERMSQRDTLVHLFAGGCGGTVGAILTCP 
LEWKTRLQSSSVTLYISEVQLNTMAGASVNRWSPGPLHCLKV 
ILEKEGPRSLFRGLGPNLVGVAPSRAIYFAAYSNCKEKLNDVFD 
PDSTQVHMISAAMAGFTAITATNPIWLIKTRLQL*/SQGTAGKR 
RMGAFECVRKVYQTDGLKGFYRGMSASYAGISETVIHFVIYESI 
KQKLLEYKTASTMENDEESVKEASDFVGMMLAAATSK\LVATTI
AYPHEWRTRLREEGTKYRSFFQTLSLLVQEEGYGSLYRGLTTH 
LVRQIP\NTAIMMATYELWYLLNG 


5892 


1764 


379 


WLRVCGRLS VNS AVS S RTGG WS AGLTCAMQR£»QWLGH]jR(5^X 
DSGWMPQAAPCLSGAPHASAADWWHGRRTAICRAGRGGFKDT
TPDELLSAVMTAVLKDVNLRPEQLGDICVGNVLQPGAGAIMARI 
AQFLSDIPETVPLSTVNRQCSSGLQAVASIAGGIRNGSYDIGMA 
CGVESMSLADRGNPGNITSRLMEKEKARDCLIPMGITSENVAER 
FGISREKQDTFALASQQKAARAQSKGCFQAEIVPVTTTVHDDKG
TKRSITVTQDEGIRPSTTMEGLAKIiKPAFKKDGSTTAGNSSQVS 
DGAAAILLARRSKAEELGLPILGVLRSYAWGVPPDIMGIGPAY
AIPVALQKAGLTVSDVDIFEINE\AFASQAAYCVEKLRLPP*EG
+TPLGGASGP*GHPLGLHWGHVQVITLAQ*S*SARGKRAYRSGC
PCAIGSWNGSPLPVFEYPWGT 


5893 


3 


1653 


ILSKRRCQKAKTKELMAKKVAVIGAGVSGLISLKCCVDEGLEPT
CFERTEDIGGVWRFKENVEDGRASIYQSWTNTSKEMSCFSDFP 
MPEDFPNFLHNSKLLEYFRIFAKKFDLLKYIQFQTTVLSVRKCP 
DFSSSGQWKWTQSNGKEQSAVFDAVMVCSGHHILPHIPLKSFP 
GMERFKGQYFHSRQYKHPDGFEGKRILVIGMGNLGSDIAVELSK 
NAAQVFISTRHGTWVMSRISEDGYPWDSVFHTRFRSMLRNVLPR 
TAVKWMIEQQMNRWFNHENYGLEPQNKYIMKEPVLNDDVPSRLL
CGAIKVKSTVKELTETSAIFEDGTVEENIDVIIFATGYSFSFPF
LEDSLVKVENNMVSLYKYIFPAHLDKSTLACIGLIQPLGSIFPT

AELQARWVTRVFKGLCSLPSERTMMMDIIKRNEKRIDLFGESQS 

QTLQT]^a3YLDErj^EIGAKPDFCSLLFKDPKLAVRLYFGPCN 

SY+YRLVGPGQWEGARNAIFTQKQRILKPLKTRALKDSSNFSVS 

FLLKILGLliAWVAFF\CQLQWS 


5894 


174 


1673 


RYSPKKVLQNKESSLKLGMATALVSAHSLAPLNLKKEGLRVVRE 
DHYSTWEQGFKLQGNSKGLGQEPLCKQFRQLRYEETTGPREALS 
RLRELCQQWLQPETHTKEHILELLVLEQFLIILPKELQARVQEH 
HPESREDWVVLEDLQLDLGETGQQVDPDQPKKQKILVEE^4APL 
KGVQEQQVRHECEVTKPEKEKGEETRIENGKLIWTDSCGRVES 
SGKISEPMEAHNEGSNLERHQAKPKEKIEYKCSEREQRFIQHLD 
LIEHASTHTGKKLCESDVCQSSSLTGHKKVIiS*ERKVlQC\HGV 
LGKAFQRSSHLVRHQKIHLGEKPYQCNECGKVFSQNAGLLEHLR 
IHTGEKPYLCIHCGKNFRRSSHLNRHQRIHSQEEPCECKECX3KT 
FSQALLLTHHQRIHSHSKSHQCNECGKAFSLTSDLIRHHRIHTG 
EKPFKCNICQKAFRLNSHLAQHVRIHNEEKPYQCSECGEAFRQR 
SGLFQHQRYHHKDKLA 


5895 


2967 


86 


HPSLLGAIPFYPPPSSPWPPPLYLFWNSHRKSRHFINQRGIHGE

MRLFVSDGVPGCLPVLAAAGRARGRAEVLISTVGPEDCWPFLT 

RPKV?VLQLDSGNYLFSTSAICRYFF\LLSGWEQDDLTNQWLEW 

EATELQPTLSAALYYL\WQGKKG\EDVLGSVRRTLTHIDHSLS 

RQ\NCPFLAGETESLADIVLWGALYPLLQDPAYLPEELSALHSW 

FQTLSTQ\EPCQR\AARRLV1iKQ\QGVLALR\PYLQKQPQPSPA 

EGKGLSPIEPEEEELATLSEEEIAMAVTAWEKGLESLPPLRPQQ
NPVLPVAGERNVLITSALPYVNNVPHLGNIIGCVLSADVFARYS
RLRQWNTLYLCGTDEYGTATETKAL\EEGLTPQEICDKYIIIIHA
DIY\RWFNISFDIFGRTTTPQQ\TKIT\QDIFQQLLKRGFVLQD
TVEQLRCEHCARF\LADRFVEGVCPFCGYEEARGDQCDKCGKLI
NAVELKKPQCKVCRSCPWQSSQHLFLDLPKLEKRLEBWLGRTL
PGSDWTPNAQFITPFFGFREWPSKPRWQ*TRDLK\WGNPGTP*E
GFEDK\VFYVWFDATIGYLSITANYTDQWERWW\KNPEQVDLYQ
FM\AKDNVPFHSLVFPSSALGAEDNYTL\VSHLXATEYLNYEDG
K\FSKSRGVGVFRDM\AHDTGIPPDISRFYL\LYIRPEGK\DSA
FSWTDLLLKNNS\ELLNNLGNFINRA\GMFVSKFFGG\YVPEMV
LTPDDQRLLA\HVTLELQHYHQ\LLEKVRIRDALRSILTIS\RH
GNQYI\QVNEPW\KRIKGSEADRQRAGTVTGLAVNIAT^LLSVML
QPYMPTVSATIQAQLQLPPPACSILLTNFLCTLPAGHQIGTVSP
LFQKLENDQIESLRQRFGGGQAKTSPKPAWETVTTAKPQQIQA
LMDEVTKQGNIVRELKAQKADKNEVAAEVAKLLDLKKQLAVAEG
KPPEAPKGKKKK


5896 


2967 


86 


HPSLLGAIPFYPPPSSPWPPPLYLFWNSHRKSRHFINQRGIHGE
MRLFVSDGVPGCLPVLAAAGRARGRAEVLISTVGPEDCWPFLT
RPKVPVLQLDSGNYLFSTSAICRYFF\LLSGWEQDDLTNQWLEW

EATELQPTLSAALYYL\WQGKKG\EDVLGSVRRTLTHIDHSLS 
RQ\NCPFLAGETESLADIVLWGALYPLLQDPAYLPEELSALHSW 
FQTLSTQ\EPCQR\AARRLVLKQ\QGVliALR\PYLQKQPQPSPA 
EGKGLSPIEPEEEELATLSEEEIAMAVTAWEKGLESLPPLRPQQ 
NPVLPVAGERNVLITSALPYVNNVPHLGNIIGCVLSADVFARYS 
RLRQWNTLYLCGTDEYGTATETKAL\EEGLTPQEICDKYHIIHA 
DIY\RWFNISFDIFGRTTTPQQ\TKIT\QDIFQQLLKRGFVLQD 
TVEQLRCEHCARF\LADRFVEGVCPFCGYEEARGDQCDKCGK3jI 
NAVELKKPQCKVCRSCPWQSSQHLFLDLPKLEKRLEEWLGRTL
PGSDWTPNAQFITPFFGFREWPSKPRWQ*TRDLK\WGNPGTP*E 
GFEDK\VFYVWFDATIGYLSITANYTDQWERWW\KNPEQVDLYQ
FM\AKDNVPFHSLVFPSSALGAEDNYTL\VSHLIATEYLNYEDG 
K\FSKSRGVGVFRDM\AHDTGIPPDISRFYL\LYIRPEGK\DSA 
FSWTDLLLKNNS\ELLNNLGNFINRA\GMFVSKFFGG\YVPEMV 
LTPDDQRLLA\HVTLELQHYHQ\LLEKVRIRDALRSILTIS\RH 
GNQYI\QVNEPW\KRIKGSEADRQRAGTVTGLAVNIAALLSVML
QPYMPTVSATIQAQLQLPPPACSILLTNFLCTLPAGHQIGTVSP
LFQKLENDQIESLRQRFGGGQAKTSPKPAWETVTTAKPQQIQA
LMDEVTKQGNIVRELKAQKADKNEVAAEVAKLbDLKKQLAVAEG 
KPPEAPKGKKKK 


5897 


2967 


86 


HPSLLGAIPFYPPPSSPWPPPLYLFWNSHRKSRHFINQRGIHGE 
MRLFVSDGVPGCLPVLAAAGRARGRAEVLISTVGPEDCWPFLT 
RPKVPVLQLDSGNYLFSTSAICRYFF\LLSGWEQDDIiTNQWLEW 
EATELQPTLSAALYYL\WQGKKG\EDVLGSVRRTLTHIDHSLS 
RQ\NCPFLAGETESLADIVLWGALYPLLQDPAYLPEELSALHSW

fqtlstqXepcqrxaarrlxtckqXqgvlalrvpylqkqpqpspa 

EGKGLSPIEPEEEELATLSEEEIAMAVTAWEKGLESLPPLRPQQ 

NPVLPVAGERNVLITSALPYVNNVPHLGNIIGCVLSADVFARYS 

RLRQWNTLYLCGTDEYGTATETKAL\EEGLTPQEICDKYHIIHA 

DIY\RWFNISFDIFGRTTTPQQ\TKIT\QDIFQQLLKRGFVLQD

TVEQLRCEHCARF\LADRFVEGVCPFCGYEEARGDQCDKCGKLI 

NAVELKKPQCKVCRSCPWQSSQHLFLDLPKLEKRLEEWLGRTL

PGSDWTPNAQFITPFFGFREWPSKPRWQ*TRDLK\WGNPGTP*E 

GFEDK\VFYVWFDATIGYLSITANYTDQWERWW\KNPEQVDLYQ 

FM\AKDNVPFHSLVFPSSALGAEDNYTL\VSHLIATEYLNYEDG

K\FSKSRGVGVFRDM\AHDTGIPPDISRFYL\LY1RPEGK\DSA 

FSWTDLLLKNNS\ELLNNLGNFINRA\GMFVSKFFGG\YVPEMV

LTPDDQRLLA\HVTLELQHYHQ\LLEKVRIRDALRSILTIS\RH 

GNQYI\QVNEPW\KRIKGSEADRQRAGTVTGLAVNIAALLSVML 

QPYMPTVSATIQAQLQLPPPACSILLTNFLCTLPAGHQIGTVSP 

LFQKLENDQIESLRQRFGGGQAKTSPKPAWETVTTAKPQQIQA 

LMDEVTKQGNIVRELKAQKADKNEVAAEVAKLLDLKKQLAVAEG 

KPPEAPKGKKKK 


5898 


2967 


66 


HPSLLGAIPFYPPPSSPWPPPLYLFWNSHRKSRHFINQRGIHGE 
MRLFVSDGVPGCLPVLAAAGRARGRAEVLISTVGPEDCWPFLT 
RPKVPVLQLDSGNYLFSTSAICRYFF\LLSGWEQDDLTNQWLEW 
EATELQPTLSAALYYL\WQGKKG\EDVLGSVRRTLTHIDHSLS 
RQ\NCPFLAGETESLADIVLWGALYPLLQDPAYLPEELSALHSW 
FQTLSTQ\EPCQR\AARRLVLKQ\QGVLALR\PYLQKQPQPSPA 
EGKGLSPIEPEEEELATLSEEEIAMAVTAWEKGLESLPPLRPQQ 
NPVLPVAGERNVLITSALPYVNNVPHLGNIIGCVLSADVFARYS
RLRQWNTLYLCGTDEYGTATETKAL\EEGLTPQEICDKYHIIHA 
DIY\RWFNISFDIFGRTTTPQQ\TKIT\QDIFQQLLKRGFVLQD 
TVEQLRCEHCARF\LADRFVEGVCPFCGYEEARGDQCDKCGKLI 
NAVELKKPQCKVCRSCPWQSSQHLFLDLPKLEKRLEEWLGRTL 
PGSDWTPNAQFITPFFGFREWPSKPRWQ*TRDLK\WGNPGTP*E 
GFEDK\VFYVWFDATIGYLSITANYTDQWERWW\KNPEQVDLYQ 
FM\AKDNVPFHSLVFPSSALGAEDNYTL\VSHLIATEYLNYEDG
K\FSKSRGVGVFRDM\AHDTGIPPDISRFYL\LYIRPEGK\DSA 
FSWTDLLLKNNS\ELLNNLGNFINRA\GMFVSKFFGG\YVPEMV
LTPDDQRLLA\HVTLELQHYHQ\LLEKVRIRDALRSILTIS\RH 
GNQYI\QVNEPW\KRIKGSEADRQRAGTVTGLAVNIAALLSVML
QPYMPTVSATIQAQLQLPPPACS1LLTNFLCTLPAGHQIGTVSP
LFQKIiENDQlESLRQRFGGGQAKTSPKPAWETVTTAKPQQIQA 
LMDEVTKQGNIVRELKAQKADKNEVAAEVAKLLDLKKQLAVAEG 
KPPEAPKGKKKK 


5899 


326 


1078 


NCPKSKEPNGVRAPSLPSPLRAAMALSDVDVKKQIKHMMAFIEQ 
EANEKAEEIDAKAEBEFNIEKGRLVQTQRLKIMEYYEKKEKQIE
QQKKILMSTMRNQARLKVLRARNDLISDLLSEAKLRLSRIVEDP 
EVYQGLLDKLVLQGLLRLLEPVMIVRCRP\QDLLLVEAAVQKAI 
PEYMTISQKHVEV\QIDKEA*LAVECSWEVWEVYSGNQRIKVSN 
TLESRLDLSAKQKMPEIRMALFGANTNRKFFI 


5900 


64 


1409 


KAASRDSPCLEFCPLCGVSSHDLQHRMWYHRLSHLHSRLQDLLK 
GGVIYPALPQPNFKSLLPLAVHWHHTASKSLTCAWQQHEDHFEL
KYANTVMRFDYVWLRDHCRSASCYNSKTHQRSLDTASVDLCIKP
KTIRLDETTLFFTWPDGHVTKYDLNWLVKNSYEGQKQKVIQPRI 
LWNAEIYQQAQVPSVDCQSFLETNEGLKKFLQNFLLYGIAFVEN
VPPTQEHTEKLAERISLIRETIYGRMWYFTSDFSRGDTAYTKLA 
LDRHTDTTYFQEPCGIQVFHCLKHEGTGGRTLLVDGFYAAEQVL 
QKAPEEFELLSKSAI\KHEYIEDVGECHQPHDWDWAQS*ISTHG
/YKELY1.IRYNNYDRAVINTVPYDWHRWYTAHRTLTIELRRPE 
NEFWVKLKPGRVLFIDNWRVLHGRECFTGYRQLCGCYLTRDDVL
NTARLLGLQA 


5901 


1 


2121 


VAIEQTSLKMMQAVGGAPARPTGEYICNQCGAKYTSLDSFQTHL
KTHLDTVLPKLTCPQCNKEFPNQESLLKHVTIHFMITSTYYICE
SCDKQFTSVDDLQKHLLDMHTFVFFRCTLCQBVFDSKVSIQLHL
\AVTCHSNEKKVYRCTSCNWDFRNETDLQLHVKHNHLENOGKVHK
CIFCGESFGTEVELQCHITTHSKKYNCKFCSKAFHA11LLEKHL
REKHCVFETKTPNCGTNGASEQVQKEEVELQTLLTNSQESHNSH
DGSEEDVDTSEPMYGCDICGAAYTMETLLQNHQLRDHNIRPGES
AIVKKKAELIKGNYKCNVCSRTFFSENGLREHMQTHLGPVKHYM
CPICXSERFPSLLTLTEHKVTHSKSLDTGNCRICKMPLQSEEBFL
EHCQMHPDLRNSLTGFRCWCMQTVTSTLELKIHGTFHMQKTGN
GSAVQTTGRGQHVQKLYKCASCLKEFRSKQDLVKLDINGLPYGL
CAGCVNLSKSASPGINVPPGTNRPGLGQNENLSAIEGKGKVGGL

KTRCS*LATFKF+VLKVELPEPHPKPFHRGVSRPDSNSTQLKTP 
QVSPMPRISPSQSDEKKTYQCIKCQMVFYNEWDIQVHVANHMID
EGLNHECKLCSQTFDSPAKLQCHLrEHSFEGMGGTFKCPVCFTV 
FVQANKLQQHIFSAHGQEDKIYDCTQCPQKFFFQTELQNHTMTQ 
HSS 


5902 


712 


209 


LKNRRRSRPSIRQSIGSTSVSRWLTSLFTYLDHTADVQ*V*REF 
IPLKPRQ*ED*MFQSWLHAWGDTLEEAFEQCAMAMFGYMTDTGT 
VEPLQTVEVETQGDDLQSLLFHFLDEWLYKFSADEFFIP\GWGE 
EFSLSKHPQGTEVKAITYSAMQVYNEENPEVFVIIDI


5903 


2106 


735 


DTPGPSLPSTTAPFSLRSLSFPSRPSYLLPGDPQPLQGRGLPTT 
PALFALSAVPGGAASPMPPSGLRLLPLLLPLLWLLVLTPGRPAA 
GLSTCKTIDMELVKRKRIEAIRGQILSKLRLASPPSQGEVPPGP 
LPEAVLALYNSTRDRVAGESAEPEPEPEADYYAKEVTRVLMVET
HNEIYDKFKQSTHSIYMFFNTSELREAVPEPVLLSRAELRLLRL
KLKVEQHVELYQKYSNNSWRYLSNRLLAPSDSPEWLSFDVTGW
RQWLSRGGEIEGFRLSAHCSCDSRDNTLQVDINGFTTGR\RGDL 
ATIHGMNRPFLLLMATPLERAQHLQS\SRHRQAL\DTNY\CFSF 
HGGRNCLRC/VHC*HLIFRKDL\GW\KWI\HE\PKGYHANFC\L 
GPCPYIWSLDTQYSKVLALYNQ\HKPG\ASAAP\CCVPQALEP\ 
LPIVYY\VGRKPKVEQLSNMIVRSCKCS 


5904 


3 


1126 


MMEEIENAINTFKEEQRLIYEELIKEEKTTNNELSAISRKIDTW 
ALGNSETEKAFRAISSKVPVDKVTPSTLPEEVLDFEKFLQQTGG 
RQGAWDDYDHQNFVKVRNKHKGKPTFMEEVLEHLPGKTQDEVQQ 
HEKWYQKFLALEERKKESIQIWKTKKQQKREEIFKLKEKADNTP 
VLFHNKQEDNQKQKEEQRKKQKLAVEAWKKQKSIEMSMKCASQL
KEEEEKEKKHQKERQRQFKLKLLIiESYTQQKKEQEEFLRLEKEI 
REKAEKAEKRKNAADEISRFQERDLHKLELKILDRQAKEDEKSQ
KQRRLAKLKEKVENNVSRDPSRLY/NTHQRLGRTNQKDRTNRLW 
ATSTYPT*GYSNLETRNTEKSMR 


5905 


287 


2912 


MASFPPRVNEKEIVRLRTIGELLAPAAPFDKKCGRENWTVAFAP
DGSYFAWSQGHRTVKLVPWSQCLQNFLLHGTKNVTNSSSLRLPR 
QNSDGGQKNKPREHIIDCGDIVWSLAFGSSVPEKQSRCVNIEWH 
RFRFGQDQLLLATGLNSGRIKIWDVYTGKLLLNLVDHTGVVRDL
TFAPDGSLILVSASRDKTLRVWDLRDDGN\MMKVLRGHQNWVY\ 
SCAFSPDSSMLCSVGASKAWAAILV*LRLCWHHSHTGATMVLS
WAERVASLATGLGATFTIG*SNLAFVLQGVLYVHRCWSMSTFCF
SFFLFFFFKVISPTVKYH*LLSKLIFQFYGIGSLTSETNLM*SI
WLSNGFSVLFFGILSDSRDILRL*FNLKFVLIFF*K*CIVSVQK
KKKPKRIALLQEERLS*DKPPSSHLI*QTEVNIRILFRAILHS* 
LLIFRI*NCI*TYS*IIDPFYIQMTYDRG*FGKNKMVKF*FIEM 
*LYYFHKIAFSFCNW*HPCCLPKKFHLAVNILFACSICFSS*A
QVGDPSLL*TSDYLKGRCQWSNNLLTLRFLSVYFFKNLWSGKK
REGGL*YLTLFISVYFS*LVFGINGFQYSF\/VKLHCLYFMFRLI 
FKLTFNRNI*NRICMSALINLKTDFNLTMTLSIFFKLLIIYNA* 
YNLN*I*QF*YKMCHFVLCMSE*SYNICLFIAGF\LWNMDKYTM
IRKLEGHHHDWACDFSPDGALLATASYDTRVYIWDPHNGDILM 
EFGHLFPPPTPIFAGGANDRWVRSVSFSHDGLHVASLADDKMVR
FWRIDEDYPVQVAPLSNGLCCAFSTDGSVLAAGTHDGSVYFWAT
PRQVPSLQHLCRMSIRRVMPTQEVQELPIPSKLLEFLSYRI


5906 


146 


2038 


REGAGSGRMASGANYNPYIEIIEQPRQRGMRFRYKCEGRSAGSI 
PGEHSTDNWRTYPSIQIMNYYGKGKV\RITLVTK\NDPYKPHPH 
DLVGKDCRD\GYYEAEFGQE\RRP\LFFQN\LGIRCVKKKEVKE
A\11TR\IKAGINPFDVP*KQLNDIEDCDLDWRLWFRVFLPDG
HGNL\TTALPPV\VSSPIYDNRAPNTAELRVCRVNKNCGSVRGG
DEIFLLCDKVQKDDIEVRFVLNDWEAKG1FSQADVHRQVAIVFK
TPPYCKAITEPVTVKMQLRRPSDQEVSESMDFRYLPDEKDTYGN 
KAKKQKTTLLFQKLCQDHVETGFRHVDQDGLELLTSGDPPTLAS 
QSAGITVNFPERPRPGLLGSIGEGRYFKKEPNLFSHDAWREMP 
TGVSSQAESYYPSPGPISSGLSHHASMAPLPSSSWSSVAHPTPR 
SGNTNPLSSFSTRTLPSNSQGIPPFLRIPVGNDLNASNACIYNN 
ADDIVGMEASSMPSADLYGISDPNMLSNCSVNMMTTSSDSMGET 
DNPRLLSMNLENPSCNSVLDPRDLRQLHQMSSSSMSAGANSNTT 
VFVSQSDAFEGSDFSCADNSMINESGPSNSTNPNSHVFVQDSQY 
SGIGSMQNEQLSDSFPYEFFQV


5907


99 


1873 


TYLLSSWSS**NLDTKIKSQVKl//RKGHKKISWPYPQPAKQNGK 
KATSKVPSAPHFVHPNDHANREAELKKKWVEEMREKQQAAREQE 
RQKRRTIESYCQDVLRRQEEFEHKEEVLQELNMFPQLDDEATRK 
AYYKEFRKWEYSDVILEVLDARDPLGCRCFQMEEAVLRAQGNK 
KLVLVIiNKIDLVPKEWEKWliDYLRNELPTVAFKASTQHQVKNL 
NRCSVPVDQASESLLKSKACFGAENLMRVLGNYCRLGEVRTHIR 
VGWGLPNVGKSSLINSLKRSRACSVGAVPGITKFMQEVYLDKF 
IRLLDAPGIVPGPNSEVGTILRNCVHVQKLADPVTPVETILQRC
NLEEISNYYGVSGFQTTEHFLTAVAHRLGKKKKGGLYSQEQAAK 
AVLADWVSGKISFYIPPPATHTLPTHLSAEIVKEMTEVFDIEDT 
EQANEDTMECLATGESDELLGDTDPLEMEIKLLHSPMTKIADAI 
ENKTTVYKIGDLTGYCTNPNRHQMGWAKRNVDHRPKSNSMVDVC 
SVDRRSVLQRIMETDPLQQGQALASALKNKKKMQKRADKIASKL 
SDSMMSALDLSGNADDGVGD 


5908 


247 


975 


HCGIKKRGEGSGSPSPASGGFQLGCQIPEPSLPSEEETHPHTRA 
HTRTLRATLTRRPPRSHSTRLRFPMPLDGDGGLASWK/PMRER* 
GWRRPAKAAGASLGVAATGKRGCRMSKRYLQKATKGKLLIIIFI
VTLWGKWSSANHHKAHHVKTGTCEWALHRCCNKNKIEERSQT 
VKCSCFPGQVAGTTRAAPSCVDASIVEQKWWCHMQPCLEGEECK 
VLPDRKGWSCSSGNKVKTTRVTH 


5909 


1 


5002 


PAIPGSTIIWAPGSHSAARADGRHGSIjPSQSQAPGALCGARAPP 
SSNLRADRSMICAQARAGKNLYHNRFLGLAAMAFPSRNSQSLRR
CKEPIRYSYNPDQFHNMDLRGGPHDGVTIPRSTSDTDLVTSDSR 
STLMGRSSYYSIGHSQDLVIHWDIKEEVDAGDWIGMYLIDEVLS 
ENFLDYKNRGVNGSHRGQIIWKIDASSYFVEPETKICFKYYHGV 
SGALRATTPSVTVKNSAAPIFKSIGADETVQGQGSRRLISFSLS 
DFQAMGLKKGMFFNPDPYLKISIQPGKHSIFPALPHHGQERRSK 
IIGNTVNPIWQAEQFSFVSLPTDVLEIEVKDKFAKSRPIIKRFL
GKLSMPVQRLLERHAIGDRWSYTLGRRLPTDHVSGQLQFRFEI 
TSSIHPDDEEISLSTEPESAQIQDSPMNNLMESGSGEPRSEAPE 
SSESWKPEQLGEGSVPDRPGNQSIELSRPAEEAAVITEAGDQGM 
VSVGPEGAGELLAQVQKDIQPAPSAEELAEQLDLGEEASALLLE 
DGEAPASTKEEPLEEEATTQSRAGREEEEKEQEEEGDVSTLEQG 
EGRLQLRASVKRKSRPCSLPVSELETVIASACGDPETPRTHYIR 
IHTLLHSMPSAQGGSAAEEEDGAEEESTLKDSSEKDGLSEVDTV 
AADPSALEEDREEPEGATPGTAHPGHSGGHFPSLANGAAQDGDT 
HPSTGSESDSSPRQGGDHSCEGCDASCCSPSCYSSSCYSTSCYS 
SSCYSASCYSPSCYNGWRFASHTRFSSVDSAKISESTVFSSQDD 
EEEENSAFESVPDSMQSPELDPESTNGAGPWQDELAAPSGHVER 
SPEGLESPVAGPSNRREGECPILHNSQPVSQLPSLRPEHHHYPT 
IDEPLPPNWEARIDSHGRVFYVDHVNRTTTWQRPTAAATPDGMR 
RSGSIQQMEQLNRRYQNIQRTIATERSEEDSGSQSCEQAPAGGG 
GGGGSDSEAESSQSSLDLRREGSLSPVNSQKITLLLQSPAVKFI

TNPEFFTVLHANYSAYRVFTSSTCLKHMILKVRRDARNFERYQH

NRDLVNFINMFADTRLELPRGWEIKTDQQGKSFFVDHNSRATTF 

IDPRIPLQNGRLPNHLTHRQHLQRLRSYSAGEASEVSRNRGASL 

LARPGHSLVAAIRSQHQHESLPLAYNDKIVAFLRQPNIFEMLQE 

RQPSLARNHTLREKIHYIRTEGNHGLEKLSCDADLVILLSLFEE 

EIMSYVPLQAAFHPGYSFSPRCSPCSSPQNSPGLQRASARAPSP

YRRDFEAKLRNFYRKLEAKGFGQGPGKIKLIIRRDHLLEGTFNQ 

VMAYSRKELQRNKIaYVTFVGEEGLDYSGPSREFFFLLSQELFNP 

YYGLFEYSANDTYTVQISPMSAFVENHLEWFRFSGRILG\LALI 

HQYLLDAFFT\RPFYKALL\RLPC\D\LSDLEYLDEEFHQSLQW 

MKDNNITDILDLTFTVNEEVFGQVTERELKSGGANTQVTEKNKK

EYIERMVKWRVERGWQQTEAIiVRGFYEWDSRLVSVFDAREIiE 

LVIAGTAEIDLNDWRNNTEYRGGYHDGHLVIRWFWAAVERFNNE

QRLRLLQFVTGTSSVPYEGFAAPPWEPMGLRRFLP*KKWGKITS 

LPPRG\HTCLQPDWDLPTVSPRTPMLYEK\LLTA\VEETSTFGT 


5910 


1526 


446 


VAEFAAMEPGRTQIKLDPRYTADLLEVLKTNYGIPSACFSQPPT 
AAQLLRALGPVELALTSILTLLALGSIAIFLEDAVYLYKNTLCP
IKRRTLLWKSSAPTWSVLCCFGLWIPRSLVLVEMTITSFYAVC
FYLLMliVMVEGFGGKEAVLRTLRDTPMMVHTGPCCCCCPCCPRL 
LLTRKKLQ\R*CWALSNTPS*R*R*PWWACFSSPTASMTQQTFL
RGAQLYGSTLSSA/CSTLLALWTLGIISRQARLHLGEQNMGAKF
ALFQVLLILTALQPSIFSVLANGGQIACSPPYSSKTRSQVMNCH

LLILETFLMTVLTRMYYRRKDHKVGYETFSSPDLDLNLKALRWM 
AWTMKGCCTH 


5911 


109 


595


QLPLAPCIQGKGLEMRSPKPQSFIIRSSHSGAGLLVKNPSTPVF
CGHRRGGAAFKYKPTPWGPEQRPTGQKHMRGGVSLLSPRLECS
GTISAHCNLRLPSSSNSPAPAS*LAGITGVCHHAQLIFVFLVET
GFHHVGQAGLELL/NWIHLPRPPKVLGLQA 


5912 


924 


277 


MILNKALMLGALALTTVMSPCGGEDIVADHVASYGVNLYQSYGP 
SGQYSHEFDGDEEFYVDLERKETVWQLPLFRRFRRFDPQFALTN 
IAVLKHNLNIVIKRSNSTAATNEVPEVTVFSKSPVTLGQPNTLI 
CLVDNIFPPWNITWLSNGHSVTEGVSETRPSSPKSDHFrjLQDQ 
VTSPSFPFE* * DL* TAKVEQLGAWFEPLIiKHWGAE I PTTL 


5913 


46 


1198 


QLRMAGAEGAAGRQSELEPWStVDVLEEDEELENEACAVLGGS 
DSEKCSYSQGSVKRQALYACSTCTPEGEEPAGICLACSYECHGS 
HKLFELYTKRNFRCDCGNSKFKNLECKLLPDKAKVNSGNKYNDN 
FFGLYCICKRPYPDPEDEIPDEMIQCWCEDWFHGRHLGAIPPE 
S GD FQEMVCQACMKRCS FLWA YAAQIiAVTK I S T \GMMDWCGTLM 
E*/DDQEVIKPENGBHQDSTLKEDVPEQGKDDVREVKVEQNSEP
CAGSSSESDLQTVFKNESLNAESKSGCKLQELKAKQLIKKDTAT 
YWPLNWRSKLCTCQDCMKMYGDLDVLFLTDEYDTVLAYENKGKI 
AQATDRSDPLMDTLSSMNRVQQVELIC/GIQ*FED


5914 


960 


124 


NLGGSELPPEEALFIQVASMNQRRVDFYLASIEDMLVAI/GGRN

ENGALSSVETYSPKTDSWSYVAGLPRFTYGHAGTIYKDFVYISG 

GHDYQIGPYRKNLLCYDHRTDVWEERRPMTTARGWHSMCSLGDS 

IYSIGGSDDNIESMERFDVLGVEAYSPQCNQWTRVAPLLHANSE 

SGVAVWEGRIYILGGYSWENTAFSKTVQVYDREADKWSRGVDLP 

KAIAGGSACFIAP+SLGQRTRKRKAKARGTRTGASDPSCASWDH 

PHRHLPGLCRPAATS 


5915 


1604 


703 


FPGRPTRPLKLGRRRKRARIIQAPHCHSPRPRTCPPGALQAPEA 
PASRAEGPVAVWNGHTEGPAPARSAPKEPPGLPRPLGSFPCPT 
PQEDFPALGGPCPPRMPPSPGFSAWLLKGTPPPPPPGLVPPIS 
KPPPGFSGLLPSPHP\PVSPAPPPPPPQK/RPRLLPAP/PGLPS 
PRELPGEEPSAHPVHQGLPAERRGPLQRVQEPLRGVQTGPDLRS 
PVLQELPGPAGGEFPEGL**AAGPAAH


5916


256 


633 


SPRMWEIWGPWHRWESFSLEGEWPSRIPEPSPDSTKGTSGKGCR 
TVTGAVHRHLNHVAGIIPWVLHSQLKPTAATAQDQWTSQQYPDH
PTRLILQ*NQATADKNN*TTALLQPHQRL\VSPRMAEA


5917 


1343 


827 


AHQILTYLEP/ICLWNYNKILTVFLTKSVLBI*KFIHTPQTYR
F*NDFFGlKBVYVSRRLkKTSF/RLAVTFLEQAWSKECVPVDQ 
FMEHLLPSLLSLASDPVPNVRVLLAKALRQMLLEKAYFRNAGNP 
HLEVIEETILALQSDRDQDVSFFAALEPKRRNIIDTAVLEKQN 


5918 


13 


1247 


EGAQVARRRSRRQWRAGRCGRGRGGRRAERTGGRGPPGRPRPLP 
PGPARRGRRRMETPFYGDEALSGLGGGASGSGGTFASPGRLFPG 
APPTAAAGSMMKKDALTLSLSEQVAAALKPAPAPASYPPA\ADG 
APSAAPPDGLLASPDLGLLKIASPELERLIIQSNGLVTTTPTSS
QFLYPKVAASEEQEFAEGFVKALEDLHKQNQLGAGRAAAAAAAA 
AGGPSGTATGSAPPGELAPAAAAPEAPVYA\NLSSY\AGGCRGL 
RGGAAT\VAFAAEPVPFPPPPPPGALGPRRP/RLALQGRRPQTV 
PDVP\SFGESP\PLSPIET\DTPRRI\KAKRKRL\RNPQIRAPK 
PASRKLGAQSRALERESEDPS*SPEHGSLASTASLLREQVAQLK
QKVLSHVNSGCQLLPQHQVPAY 


5919


1 


4254 


TSVQGDSQGTPTSSQGSINMEHWISQAIHGSTTSTTSSSSTQSG

GSGAAHRLADVMAQTHIENHSAPPDVTTYTSEHSIQVERPQGST 

GSRTAPKYGNAELMETGDGVPVSSRVSAKIQQLVNTLKRPKRPP

LREFFVDDFEELLEVQQPDPNQPKPEGAQMLAMRGEQLGWTNW 

PPSLEAALQRWGTISPKAPCLTTMDTNGKPLYILTYGKLWTRSM 

KVAYSILHKLGTKQEPMVRPGDRVALVFPNNDPAAFMAAFYGCL 

LAEWPVPIEVPLTRKDAGSQQIGFLLGSCGVTVALTSDACHKG 

LPKSPTGEIPQFKGWPKLLWFVTESKHLSKPPRDWF\PHIKDAN 

NDTAYIEYKTCK\DGSVLGVTVTRTALLTHCQALTQACGYTEAE 

TIVNVLDFKKDVGLWHGILTSVMNMMHVISIPYSLMKVNPLSWI 

QKVCQYKAKVACVKSRDMHWALVAHRDQRDINLSSLRMLIVADG

ANPWSISSCDAFLNVFQSKGLRQEVICPCASSPEALTVAIRRPT 

DDSNQPPGRGVLSMHGLTYGVIRVDSEEKLSVLTVQDVGLVMPG 

AIMCSVKPDGVPQLCRTDEIGELCVCAVATGTSYYGLSGMTKNT 

FEVFAMTSSGAPISEYPFIRTGLLGFVGPGGLVFWGKMDGLMV 

VSGRRHNADDIVATALAVEPMKFVYRGRIAVFSVTVLHDERIVI

VAEQRPDSTEEDSFQWMSRVLQAIDSIHQVGVYCLALVPANTLP

KTPLGGIHLSETKQLFLEGSLHPCNVLMCPHTCVTNLPKPRQKQ 

PEIGPASVMVGNLVSGKRIAQASGRDLGQIEDNDQARKFLFLSE

VLQWRAQTTPDHILYTLLNCRGAIANSLTCVQLHKRAEKIAVML 

MERGHLQDGDHVALVYPPGIDLIAAFYGCLYAGCVPITVRPPHP

QNIATTLPTVKMIVEVSRSACLMTTQLICKLLRSREAAAAVDVR 

TWPLILDTDD*PKKRPAQICKPCNPDTLAYIiDFSVSTTGMLAGV 

KMSHAATSAFCRSIKLQCELYPSREVAICLDPYCGLGFVLWCLC

SVYSGHQSILIPPSELETNPALWLLAVSQYKVRDTFCSYSVMEL 

CTKGLGS QTES L KARGLDL S RVRTC VWAEE RPR I ALTQS F S KX> 

FKDLGLHPRAVSTSFGCRVNLAICLQGTSGPDPTTVYVDMRALR 

HDRVRLVERGSPHSLPLMESGKILPGVRIIIANPETKGPLGDSH 

LGEIWVHSAHNASGYFTIYGDESLQSDHFNSRLSFGDTQTIWAR 

TGYLGFLRRTELTDANGERHDALYWGALDEAMELRGMRYHPID 

IETSVIRAHKSVTECAVFTWTNLLWWELDGSEQEALDLVPLV 

TNWLEEH YLI VGVWWDIGVI P INSRGEKQJRMHLRDGFLADQ 

LDPIYVAYNM 


5920 


1381 


1499 


QLGAVAHAGVSRIPP*LFPPLHPTFLSLWCLHHKLP/HPPGASM 
VRPPWPRRPPAHISSVRQASTQVPRTVPHTQRVANIGTQTTGP 
SGVGCCTPGRPLLPCKCSSAAHSTYRVQEPAVHIPGQEPLTASM 
IiAAAPLHEQKQMIGERLYPLIHDVHTQIiAGKITGMLLEIDNSEL 
LLMLESPESLHAKIDEAVAVLQAHQAMEQPKAYMH


5921 
5922 


"7 9 7 

2475 


157 
495 


VCPGTGGE*GLWGQLGGLPKETPLKPMDAFTGSGLKRKFDDVDV 
GSSVSNSDDEISSSDSADSCDSLNPPTTASFTPTSILKRQKQLR 
RKNVRFDQVTVYYFARRQGFTSVPSQGGSSLGMAQRHNSVRSY.T 
LCEFAQEQEVNHREILREHLKEEKLHAKKMKLTKNGTVESVEAD 
GLTLDDVSDEDIDVENVEVDDYFFLQPLPTKRRRALLRASGVHR
IDAEEKQELRAIRLSREECGCDCRLYCDPEACACSQAGIKCQVD 
RMSFPCGCSRDGCGNMAGRIEFNPIRVRTHYLHTIMKLELESKR
Q\GAAQQPQ\*GALPDCQLQPDRSTGL+DPSWIGSKGLSFTGKG 
AAATHLIILRVIENRGAEGKRK






SYSNWGLFPSVFIQVPRSRTGNLKPIFLFYSYYE\CMErLKG\T
CLYNATQYKVCSPRNDRPDACYNPSEPAATTVFEIRTGLLLGDT 
SKIITRTEEKEIPKQITLRFDACAAINSKKLEIGCGSLN*ERS*
RVENKYVCHESGVCKNCAYWPCVI*AT*KKNKNDSVYLQKGEAN
PSCAAGHCNPLELIITNPLDPHWKKGERVTLGINRTGLKPQWI 

LIKGEVHKCSPKPVPQTFYEELNLPAPELLKKTKNLFLQLAENV

IFLLNGTSCYVRGGTTIGDRWPWEA*ELVPTDPAPD11PI*KAE
ASNF*VLKTSIIRQYCIAREGKDFIIPVGKPNCIGQKLYNSTTK 
TIT**DLNHTEKNPFSKFSKLKTA*AHAESH*DWTVPSGLY*IC 
RHRAYFRLPNKWADSCVIGTIKPSFFLLPIKMGELLGFSVYASR 
fc, KKG 1 V I GN W KDNEW P RE R 1 1 Q Y YG P ATWAQDG S WG YR / T P / VY 
MLNWIIRLQAILEIISNETGRALTVLAWQETQMRNAIYQNRLAL
DYLLVAEGGVCRKFNLTNCCLQINDQGQWKNIVRDMTKLAHVP
IQVWHKFDPESLFGKWFPAIGGFKTLIVGVLLVIRTCLLLPCVL 
PLLFQMIKGIVATLVHQKTSAHVNYMNHYRSISQRDSKSEDESE 
NSH 


5923 


137 


638 


QLCGRRGQRFRTSIKRMHPI*RTCPNT^t/lILLS^EN ; TOlRDL " 
QQENRELWISLEEHQDALELIMSKYRKQMLQLMVAKKAVDAEPV
LKAHQSHSAEIESQIDRICEMGEVMRKAVQVDDDQFCKIQEKLA 
QLELENKELRELLSISSESLQARKENSMDTASQAIK 


5924 


274 


2146 


EKGKVKDAGAEQWISLSLSCKGSWETQFSNHLNSLTPPTSVRRM 
PLITTVTLLKMVARHHMKLLCSKAFSTQLQQKIFLHSQMGIHHQ
SVCMKLKPNTSHIISILMGQPMALVQLETLAPLTIIIQKFQTQD 
HMKFWKNLPLHSHHLTPSVPQTVIPKKTGSPEIKLKITKTIQNG 
RELFESSLCGDLLNEVQASE\Q*NQSIESRKEKRKKSNKHDSSR 
SEERKSHKIPKLEPEEQNRPNERVDTVSEKPREEPVLKEGSPSS 
ANTIFCSNNGSVHW\FKFQVGDLVWSKVGTYPWWPCMVSSDPQL 
EVHTKINTRGAREYHVQFFSNQPERAWVHEKRVREYKGHKQYEE 
LLAEATKQASNHSEKQKIRKPRPQRERAQWDIGIAHAEKALKMT 
REERIEQYTFIYIDKQPEEALSQAKKSVASKTEVKKTRRPRSVL
NTQPEQTNAGEVASSLSSTEIRRHSQRRHTSAEEEEPPPVKIAW 
KTAAARKSLPASITMHKGSLDLQKCNMSPWKIEQVFALQNATG
DGKFIDQFVYSTKGIGKTKTEISVRGQDRL11STPNQRNEKPTQS
VSSPEATSGSTGSVEKKQQRRSIRTRSESEKSTEWPKKKIKKE
QVGFLHVES 


5925 


216 


1911 


MMTAESREATGLSPQAAQEKDGIVIVKVEEEDEEDHMWGQDSTL

QDTPPPDPEIFRQRFRRFCYQNTFGPREALSRLKELCHQWLRPE 

INTKEQILELLVLEQFLSILPKELQVWLQEYRPDSGEEAVTLLE 

DLELDLSGQQVPGQVHGPEMLARGMVPLDPVQESSSFDLHHEAT 

QSHFKHSSRKPRLLQSRALPAAHIPAPPHEGSPRDQAMASALFT 

ADSQAMVKIEDMAVSLILEEWGCQNLARRNLSRDNRQENYGSAF

PQGGENRNENEESTSKAETSEDSASRGETTGRSQKEFGEKRDQE 

GKTGERQQKNPEEKTRKEKRDSGPAIGKDKKTITGERGPREKGK 

GLGRSFSLSSNFTTPEEVPTGTKSHRCDECGKCFTRSSSLIRHK 

IIHTGEKPYECSECGKAF\SLNS\NLVLHQRI\HTGEKPHECNE

CGKAFSHSSNLILHQRIHSGEKPYECNECGKAFSQSSD\LTKHQ 

RIHTGEKPYECSECGKAFNRNSYLILHRRVHTREKPYKCTKCGK 

\AFTRSSTLTLHHRIHARERASEYSPASLDAFGAFLKSCV 


5926 


2 


233 


DRCLMLKQGSQPGSPPAT/CEPPAPPVYQAPCQSCPEPPGAHEP
SDSPHHTPVHPPPEHSAACPAPATCCPPPRSSMS 


5927 


4146


1248 


KHFSKFGSQALYQLKRPASGQNSISVMPAQKITKPAAKYGIPLA
YKKYGDKKLHEKKPLQKHKQAHQTPEKRVNTGEERRKISEEAAR 
i\KKJ_»ci c xct iva K.A.QKDQ 1 1 SLMKAEQM KRQS KE RLER I NRARE QG 
WRNVLSAGGSGEVKAPFLGSGGTIAPSSFSSRGQYEHYHAIFDQ 
MQQQRAEDNEAKWKREIYGRGLPERQKGQLAVERAKQVEEFLQR
KREAMQNKARAEGHMGILQNLAAMYGGRPSSSRGGKPRNKEEEV
YLARLRQIRLQNFNERQQIKAKLRGEKKEANHSEGQEGSEEADM 
RRKK\IESLKAHANARAAVLKEQLERKRKEAYEREKKVWEEHLV 
AKGVKSSDVSPPLGQHETGGSPSKQQMRSVISVTSALKEVGVDS
SLTDTRETSEEMQKTNNAISSKREILRRLNENLKAQEDEKGKQN 
LSDTFEINVHEDAKEHBKEKSVSSDRKKWEAGGQLVIPLDELTL 
DTSFSTTERHTVGEVIKLGPNGSPRRAWGKSPTDSVLKILGEAE






WO 01/53312 



PCT/US00/34263 



LQLQTELLENTTIRSEISPEGEKYKPLIT^EKKVQCISHEINPS 
AIVDSPVETKSPEFSEASPQMSLKLEGNLEEPDDLETEILQEPS 
GTNKDE\SLPCTITDVWrSEEKETKETQSADRITIQENEVSEDG 
VSSTVDQLSDIHIEPGTNDSQHSKCDVDKSVQPEPFFHKWHSE 
HLNLVPQVQSVQCSPEESFAFRSHSHLPPKNKNKNSLLIGLSTG
LFDANNPKMLRTCSLPDLSKLFRTLMDVPTVGDVRQDNLEIDEI 
EDENIKEGPSDSEDIVFEETDTDLQELQASMEQLLREQPGEEYS 
EEEESVLKNSDVEPTANGTDVADEDDNPSSESALNEEWHSDNSD
GEIASECECDSVFNHLEELRLHLEQEMGFEKFFEVYEKIKAIHE 
DEDENIEICSKIVQNILGNEHQHLYAKILHLVMADGAYQEDNDE


5928 


4146 


1248 


KHFSKFGSQALYQLKRPASGQNSISVMPAQKITKPAAKYGIPLA 
YKKYGDKKLHEKKPLQKHKQAHQTPEKRVNTGEERRKISEEAAR 
KRRLEFIEKEKKQKDQIISLMKAEQMKRQEKERLERINRAREQG 
WRNVLSAGGSGEVKAPFLGSGGTIAPSSFSSRGQYEHYHAIFDQ 
MQQQRAEDNEAKWKREIYGRGLPERQKGQLAVERAKQVEEFLQR 
KREAMQNKARAEGHMGILQNLAAMYGGRPSSSRGGKPRNKEEEV 
YLARLRQIRLQNFNERQQIKAKLRGEKKEANHSEGQEGSEEADM
RRKK\IESLKAHANARAAVLKEQLERKRKEAYEREKKVWEEHLV
AKGVKSSDVSPPLGQHETGGSPSKQQMRSVISVTSALKEVGVDS
SLTDTRETSEEMQKTNNAISSKREILRRLNENLKAQEDEKGKQN 
LSDTFEINVHEDAKEHEKEKSVSSDRKKWEAGGQLVIPLDELTL 
DTSFSTTERHTVGEVIKLGPNGSPRRAWGKSPTDSVLKILGEAE 
LQLQTELLENTTIRSEISPEGEKYKPLITGEKKVQCISHEINPS 
AIVDSPVETKSPEFSEASPQMSLKLEGNLEEPDDLETEILQEPS 
GTNKDE\SLPCTITDVWISEEKETKETQSADRITIQENEVSEDG 
VSSTVDQLSDIHIEPGTNDSQHSKCDVDKSVQPEPFFHKWHSE 
HLNLVPQVQSVQCSPEESFAFRSHSHLPPKNKNKNSLLIGLSTG 
LFDANNPKMLRTCSLPDLSKLFRTLMDVPTVGDVRQDNLEIDEI 
EDENIKEGPSDSEDIVFEETDTDLQELQASMEQLLREQPGEEYS 
EEEESVLKNSDVEPTANGTDVADEDDNPSSESALNEEWHSDNSD 
GEIASECECDSVFNHLEELRLHLEQEMGFEKFFEVYEKIKAIHE 
DEDEN I E I CSK I VQNIIjGNEHQHLYAKI LHLVMADGAYQEDNDE 


5929 


3 


1558 


LDFSMTTQLPAYVAILLFYVSRASCQDTFTAAVYEHAAILPNAT 
LTPVSREEALALMNRNLDILEGAITSAADQGAHIIVTPEDAIYG 
WNFNRDSLYPYLEDIPDPEVNWIPCNNRNRFGQTPVQERLSCL\
AKNNSIYWANIGDKKPCDTSDPQCPPDGRYQYNTDWF\DSQG 
KLVARYHKQNLFMGENQFNVPKEPEIVTFNTTFGSFGIFTCFDI
LFHDPAVTLVKDFHVDTIVFPTAWMNVLPHLSAVEFHSAWAMGM
RVNFLASNIHYPSKKMTGSGIYAPNSSRAFHYDMKTEEGKLLLS 
QLDSHPSHSAWNWTSYASSIEALSSGNKEFKGTVFFDEFTFVK 
LTGVAGNYTVCQKDLCCHLSYIQMSENIPNEVYALGAFDGLHTVE 
GRYYLQICTLLKCKTTNLNTCGDSAETASTRFEMFSLSGTFGTQ 
YVFPEVLLSENQLAPGEFQVSTDGRLFSLKPTSGPVLTVTLFGR 
LYEKDWASNASSGLTAQARIIMLIVIAPIVCSLSW 


5930 


113 


6082 


RGNCFWIVPFTMAQRTGLEDPERYLFVDRAVIYNPATQADWTAK 
KLVWIPSERHGFEAASIKEERGDEVMVELAENGKKAMVNKDDIQ 
KMNPPKFSKVEDMAELTCLNEASVLHNLKDRYYSGLIYTYSGLF 
CWINPYKNLPIYSENIIEMYRGKKRHEMPPHIYAISESAYRCM 
LQDREDQSILCTGESGAGKTENTKKVIQYLAHVASSHKGRKDHN 
IPGE\LERQLLQANPILESFGNARTVQNDNSSRFGKFIRINFDV 
TGYIVGANIETYLLEKSRAVRQAKDERTFHIFYQLLSG\AGEHL 
KSDLLLEGFNNYRFLSNGYIPIPGQ\QDKGNFRGDPGEAMHIMG 
FSHEEILSMLKWSSVLQFGNISFKKERNTDQASMPENTVAQKL 
CHLLGMNVMEFTRAILTPRIKVGRDYVQKAQTKEQADFAVEALA 
KATYERLFRWLVHRINKALDRTKRQGASFIGILDIAGFEIFELN 
SFEQLCINYTNEKLQQLFNHTMFILEQEEYQREGIEWNFIDFGL 
DLQPCIDLIERPANPPGVLALLDEECWFPKATDKTFVEKLVQEQ 
GSHSKFQKPRQLKDKADFCIIHYAGKVDYKADEWLMKNMDPLND 
NVATLLHQSSDRFVAELWKDVDRIVGLDQVTGMTETAFGSAYKT 
KKGMFRTVGQLYKESLTKLMATLRNTNPNFVRCIIPNHEKRAGK 
LDPHLVLDQLRCNGVLEGIRICRQGFPNRIVFQEFRQRYEILTP 



NAIPKGFMDGKQACERMIRALELDPNLYRIGQSKIFFRAGVLAH 

LEEERDLKITDIIIFFQAVCRGYLARKAFAKKQQQLSALKVLQR 

NCAAYLKLRHWQWWRVFTKVKPLLQVTRQEEELQAKDEELLKVK 

EKQTKVEGELEEMERKHQQLLEEKNILAEQLQAETELFAEAEEM 

RARLAAKKQELEEILHDLESRVEEEEERNQILQNEKKKMQAHIQ 

DLEEQLDEEEGARQKLQLEKVTAEAKIKKMEEEILLLEDQNSKF 

IKEKKLMEDRIAECSSQLAEEEEKAKNLAKIRNKQEVMISDLEE 

RLKKEEKTRQELEKAKRKLDGETTDLQDQIAELQAQIDELKLQL 

AKKEEELQGALARGDDETLHKNNALKWRELQAQIAELQEDFES 

EKASRNKAEKQKRDLSEELEALKTELEDTLDTTAAQQELRTKRE 

QEVAELKKALEEETKNHEAQIQDMRQRHATALEELSEQLEQAKR 

FKANLEKNKQGLETDNKELACEVKVLQQVKAESEHKRKKLDAQV 

QELHAKVSEGDRLRVELAEKASKLQNELDNVSTLLEEAEKKGIK 

FAKDAASLESQLQDTQELLQEETRQKLNLSSRIRQLEEEKNSLQ 

EQQEEEEEARKNLEKQVLALQSQLADTKKKVDDDLGTIESLEEA 

KKKLLKDAEALSQRLEEKALAYDKLEKTKNRLQQELDDLTVDLD 

HQRQVASNLEKKQ\KKFDQLLAEEKSISARYAEERDRAEAEARE 

KETKALSLARALEEALEAKEEFERQNKQLRADMEDLMSSKDDVG 

KNVHELEKSKRALEQQV\EEMRTQLEELEDELQATEDAKLRLEV 

NMQAMKAQFERDLQTRDEQNEEKKRLLIKQVRELEAELEDERKQ 

RALAVASKKKMEIDLKDLEAQIEAANKARDEVIKQLRKLQAQMK 

DYQRELEEARASRDEIFAQSKESEKKLKSLEAEILQLQEELASS 

ERARRHAEQERDELADEITNSASGKSALLDEKRRLEARIAQLEE 

SLEEEQSNMELLNDRFRKTTLQVDTLNAELAAERSAAQKSDNAR 

QQLERQNKELKAKLQELEGAVKSKFKATISALEAXIGQLEEQLE 

QEAKERAAANKLVRRTEKKLKEIFMQVEDERRHADQYKEQMEKA 

NARMKQLKRQLEEAEEEATRANASRRKLQRELDDATEANEGLSR 

EVSTLKNRLRRGGPISFSSSRSGRRQLHLEGASLELSDDDTESK 

TSDVNETQPPQSE 


5931 


113 


6082 


RGNCFWIVPFTMAQRTGLEDPERYLFVDRAVIYNPATQADWTAK 

KLVWIPSERHGFEAASIKEERGDEVMVELAENGKKAMVNKDDIQ 

KMNPPKFSKVEDMAELTCLNEASVLHNLKDRYYSGLIYTYSGLF 

CWINPYKNLPIYSENIIEMYRGKKRHEMPPHIYAISESAYRCM 

LQDREDQSILCTGESGAGKTENTKKVIQYLAHVASSHKGRKDHN 

IPGE\LERQLLQANPILESFGNARTVQNDNSSRFGKFIRINFDV 

TGYIVGANIETYLLEKSRAVRQAKDERTFHIFYQLLSG\AGEHL 

KSDLLLEGFNNYRFLSNGYIPIPGQ\QDKGNFRGDPGEAMHIMG 

FSHEEILSMLKWSSVLQFGNISFKKERNTDQASMPENTVAQKL 

CHLLGMNVMEFTRAILTPRIKVGRDYVQKAQTKEQADFAVEALA 

KATYERLFRWLVHRINKALDRTKRQGASFIGILDIAGFEIFELN 

SFEQLCINYTNEKLQQLFNHTMFILEQEEYQREGIEWNFIDFGL 

DLQPCIDLIERPANPPGVLALLDEECWFPKATDKTFVEKLVQEQ 

GSHSKFQKPRQLKDKADFCIIHYAGKVDYKADEWLMKNMDPLND 

NVATLLHQSSDRFVAELWKDVDRIVGLDQVTGMTETAFGSAYKT 

KKGMFRTVGQLYKESLTKLMATLRNTNPNFVRCIIPNHEKRAGK 

LDPHLVLDQLRCNGVLEGIRICRQGFPNRIVFQEFRQRYEILTP 

NAIPKGFMDGKQACERMIRALELDPNLYRIGQSKIFFRAGVLAH 

LEEERDLKITDIIIFFQAVCRGYLARKAFAKKQQQLSALKVLQR 

NCAAYLKLRHWQWWRVFTKVKPLLQVTRQEEELQAKDEELLKVK 

EKQTKVEGELEEMERKHQQLLEEKNILAEQLQAETELFAEAEEM 

RARLAAKKQELEEILHDLESRVEEEEERNQILQNEKKKMQAHIQ 

DLEEQLDEEEGARQKLQLEKVTAEAKIKKMEEEILLLEDQNSKF 

IKEKKLMEDRIAECSSQLAEEEEKAKNLAKIRNKQEVMISDLEE 

RLKKEEKTRQELEKAKRKLDGETTDLQDQIAELQAQIDELKLQL 

AKKEEELQGALARGDDETLHKNNALKWRELQAQIAELQEDFES 

EKASRNKAEKQKRDLSEELEALKTELEDTLDTTAAQQELRTKRE 

QEVAELKKALEEETKNHEAQIQDMRQRHATALEELSEQLEQAKR 

FKANLEKNKQGLETDNKELACEVKVLQQVKAESEHKRKKLDAQV 

QELHAKVSEGDRLRVELAEKASKLQNELDNVSTLLEEAEKKGIK 

FAKDAASLESQLQDTQELLQEETRQKLNLSSRIRQLEEEKNSLQ 

EQQEEEEEARKNLEKQVLALQSQLADTKKKVDDDLGTIESLEEA 



KKKLLKDAEALSQRLEEKALAYDKLEKTKNRLQQELDDLTVDLD 
HQRQVASNLEKKQ\KKFDQLLAEEKSISARYAEERDRAEAEARE 

KETKALSLARALEEALEAKEEFERQNKQLRADMEDLMSSKDDVG 
KNVHELEKSKRALEQQV\EEMRTQLEELEDELQATEDAKLRLEV 
NMQAMKAQFERDLQTRDEQNEEKKRLLIKQVRELEAELEDERKQ 

RALAVASKKKMEIDLKDLEAQIEAANKARDEVIKQLRKLQAQMK 

DYQRELEEARASRDEIFAQSKESEKKLKSLEAEILQLQEELASS 
ERARRHAEQERDELADEITNSASGKSALLDEKRRLEARIAQLEE 
ELEEEQSNMELLNDRFRKTTLQVDTLNAELAAERSAAQKSDNAR 
QQLERQNKELKAKLQELEGAVKSKFKATISALEAKIGQLEEQLE 
QEAKERAAANKLVRRTEKKLKEIFMQVEDERRHADQYKEQMEKA 
NARMKQLKRQLEEAEEEATRANASRRKLQRELDDATEANEGLSR 
EVSTLKNRLRRGGPISFSSSRSGRRQLHLEGASLELSDDDTESK 

TSDVNETQPPQSE 


5932 


33 


572 


RHLEEICFLFLQKGRKLKLSGPRWEEGKPRGTGGLWVKAEANMG 
FGATLAVGLTIFVLSWTIIICFTCSCCCLYKTCRRPRPV\APP 
PHPP/PVVHAPYPQPPSVPPSYPGPSYQGYHTMPPQPGMPAAPY 
PMQYPPPYPAQPMGPPAYHETLAGGAAAPYPASQPPYNPAYMDA 
PKAAL 


5933 


1 


3190 


GTRKLKMADKTPGGSQKASSKTRSSDVHSSGSSDAHMDASGPSD 
SDMPSRTRPKSPRKHNYRNESARESLCDSPHQNLSRPLLENKLK 
AFSIGKMSTAKRTLSKKEQEELKKKEDEKAAAEIYEEFLAAFEG 
SDGNKVKTFVRGGWNAAKEEHETDEKRGKIYKPSSRFADQKNP 
PNQSSNERPPSLLVIETKKPPLKKGEKEKKKSNLELFKEELKQI 
QEERDERHKTKGRLSRFEPPQSDSDGQRRSMDAPSRRNRSSGVL 
DDYAPGSHDVGDPSTT\NFYLGNI\NPQMNLKKCCCQEFGRFGP 
LASVKIMWPRTDEERARERNCGFVAFMNRRDAERALKNLNGKMI 
MSFEMKLGWGKAVPIPPHPIYIPPSMMEHTLPPPPSGLPFNAQP 
RERLKNPNAPMLPPPKNKEDFEKTLSQAIVKWIPTERNLLALI 
HRMIEFWREGPMFEAMIMNREINNPMFRFLFENQTPAHVYYRW 
KLYSILQGDSPTKWRTEDFRMFKNGSFWRPPPLNPYLHGMSEEQ 
ETEAFVEEPSKKGALKEEQRDKLEEILRGLTPRKNDIGDAMVFC 
LNNAEAAEEIVDCITESLSILKTPLPKKIARLYLVSDVLYNSSA 
KVANASYYRKFFETKLCQIFSDLNATYRTIQGHLQSENFKQRVM 
TCFRAWEDWAIYPEPFLIKLQNIFLGLVNIIEEKETEDVPDDLD 
GAPIEEELDGAPLEDVDGIPIDATPIDDLDGVPIKSLDDDLDGV 
PLDATEDSKKNEPIFKVAPSKWEAVDESELEAQAVTTSKWELFD 
QHEESEEEENQNQEEESEDEEDTQSSKSEEHHLYSNPIKEEMTE 
SKFSKYSEMSEEKRAKLREIELKVMKFQDELESGKRPKKPGQSF 
QEQVEHYRDKLLQREKEKELERERERDKKDKEKLESRSKDKKEK 
DECTPTRKERKRRHSTSPSPSRSSSGRRVKSPSPKSERSERSER 
SHKESSRSRSSHKDSPRDVSKKAKRSPSGSRTPKRSRRSRSRSP 
KKSGKKSRSQSRSPHRSHKKSKGKTNTGRKFFKKAVTYWKCDLF 
LCPERSVF 


5934 


1 


3190 


GTRKLKMADKTPGGSQKASSKTRSSDVHSSGSSDAHMDASGPSD 
SDMPSRTRPKSPRKHNYRNESARESLCDSPHQNLSRPLLENKLK 
AFSIGKMSTAKRTLSKKEQEELKKKEDEKAAAEIYEEFLAAFEG 
SDGNKVKTFVRGGWNAAKEEHETDEKRGKIYKPSSRFADQKNP 
PNQSSNERPPSLLVIETKKPPLKKGEKEKKKSNLELFKEELKQI 
QEERDERHKTKGRLSRFEPPQSDSDGQRRSMDAPSRRNRSSGVL 
DDYAPGSHDVGDPSTT\NFYLGNI\NPQMNLKKCCCQEFGRFGP 
LASVKIMWPRTDEERARERNCGFVAFMNRRDAERALKNLNGKMI 
MSFEMKLGWGKAVPIPPHPIYIPPSMMEHTLPPPPSGLPFNAQP 
RERLKNPNAPMLPPPKNKEDFEKTLSQAIVKWIPTERNLLALI 
HRMIEFWREGPMFEAMIMNREINNPMFRFLFENQTPAHVYYRW 
KLYSILQGDSPTKWRTEDFRMFKNGSFWRPPPLNPYLHGMSEEQ 
ETEAFVEEPSKKGALKEEQRDKLEEILRGLTPRKNDIGDAMVFC 
LNNAEAAEEIVDCITESLSILKTPLPKKIARLYLVSDVLYNSSA 
KVANASYYRKFFETKLCQIFSDLNATYRTIQGHLQSENFKQRVM 
TCFRAWEDWAIYPEPFLIKLQNIFLGLVNIIEEKETEDVPDDLD 
GAPIEEELDGAPLEDVDGIPIDATPIDDLDGVPIKSLDDDLDGV 
PLDATEDSKKNEPIFKVAPSKWEAVDESELEAQAVTTSKWELFD 
QHEESEEEENQNQEEESEDEEDTQSSKSEEHHLYSNPIKEEMTE 
SKFSKYSEMSEEKRAKLREIELKVMKFQDELESGKRPKKPGQSF 
QEQVEHYRDKLLQREKEKELERERERDKKDKEKLESRSKDKKEK 
DECTPTRKERKRRHSTSPSPSRSSSGRRVKSPSPKSERSERSER 
SHKESSRSRSSHKDSPRDVSKKAKRSPSGSRTPKRSRRSRSRSP 
KKSGKKSRSQSRSPHRSHKKSKGKTNTGRKFFKKAVTYWKCDLF 
LCPERSVF 


5935 


3 


4493 


SYWLSGWRLSRPPRQFWAGWRGIGRFGTMAPVHGDDCEIGASAL 

SDSGSFVSSRARREKKSKKGRQEALERLKKAKAGERYKYEVEDF 

TGVYEEVDEEQYSKLVQARQDDDWIVDDDGIGYVEDGREIFDDD 

LEDDALDADEKGKDGKARNKDKRLVKKLAVTKPNNIKSMFIACA 

GKKTADKAVDLSKDGLLGDILQDLNTETPQITPPPVMILKKKRS 

IGASPNPFSVHTATAVPSGKIASPVSRKEPPLTPVPLKRAEFAG 

DDVQVESTEEEQESGAMEFEDGDFDEPMEVEEVDLEPMAAKAWD 

KESEPAEEVKQEADSGKGTVSYLGSFLPDVSCWDIDQEGDSSFS 

VQEVQVDSSHLPLVKGADEEQVFHFYWLDAYEDQYNQPGWFLF 

GKVWIESAETHVSCCVMVKNIERTLYFLPREMKIDLNTGKETGT 

PISMKDVYEEFDEKIATKYKIMKFKSKPVEKNYAFEIPDVPEKS 

EYLEVKYSAEMPQLPQDLKGETFSHVFGTNTSSLELFLMNRKIK 

GPCWLEVKKSTALNQPVSWCKVEAMALKPDLVNVIKDVSPPPLV 

VMAFSMKTMQNAKNHQNEIIAMAALVHHSFALDKAAPKPPFQSH 

FCWSKPKDCIFPYAFKEVIEKKNVKVEVAATERTLLGFFLAKV 

HKIDPDIIVGHNIYGFELEVLLQRINVCKAPHWSKIGRLKRSNM 

PKLGGRSGFGERNATCGRMICDVEISAKELIRCKSYHLSELVQQ 

ILKTERWIPMENIQNMYSESSQLLYLtiEHTWKDA\KFILQIMC 

ELNVLPLALQITNIAGNIMSRTLMGGRSERNEFLLLHAFYENNY 

IVPDKQIFRKPQQKLGDEDEEIDGDTNKYKKGRKKGAYAGGLVL 

DPKVGFYDKFILLLDFNSLYPSIIQEFNICFTTVQRVASEAQKV 

TEDGEQEQIPELPDPSLEMGILPREIRKLVERRKQVKQLMKQQD 

tiNPDLILQYDIRQKALKLTANSMYGCLGFSYSRFYAKPLAALVT 

YKGREILMHTKEMVQKMNLEVIYGDTDSIMINTNSTNLEEVFKL 

GNKVKSEVNKLYKLLEIDIDGVFKSLLLLKKKKYAALWEPTSD 

GNYVTKQELKGLDIVRRDWCDLAKDTGNFVIGQILSDQSRDTIV 

ENIQKRLIEIGENVLNGSVPVSQFEINKALTKDPQDYPDKKSLP 

HVHVALWINSQGGRKVKAGDTVSYVICQDGSNLTASQRAYAPEQ 

LQKQDNLTIDTQYYLAQQIHPWARICEPIDGIDAVLIATGWEL 

\DPTQFKVHHYHKDEENDALLGGPAQLTDEEKYRDCERFKCPCP 

TCGTENIYDNVFDGSGTDMEPSLYRCSNIDCKASPLTFTVQLSN 

KLIMDIRRFIKKYYDGWLICEEPTCRNRTRHLPLQFSRTGPLCP 

ACMKATLQPEYSDKSLYTQLCFYRYIFDAECALEKLTTDHEKDK 

LKKQFFTPKVLQDYRKLKNTAEQFLSRSGYSEVNLSKLFAGCAV 

K3 


5936 


1124 


139 


RGEEQFDAEFRRFACLGFGERLQEFSRLLRAVHRSRAWTCYLAI 
RMLMATCCPSPTTTACTGPWQRAPPLRLLVQKREADSSGLAFAS 
NSLQRRKKGLLLRPVAPLRTRPPLLISLPQDFRQVSSVIDVDLL 
PETHRRVRLHKHGSDRPLGFYIRDGMSVRVAPQG\LERVPGIFI 
SRLVRGGLAESTGLLAVSDEILEVNGIEVAGKTLNQVTDMMVAN 
SHN\LIVTVKPANQRNNWRGASGRLTGPPSAGPGPAEPDSDDD 
SSDLVIENRQPPSSNGLSQGPPCWDLHPGCRHPGTRSSLPSLDD 
QEQASSGWGSRIRGDGSGFSL 


5937 


31 


1600 


PTSLLKSTVQLMCRLLQDKRYQCVYSLAEIFKVLASFYVILVIL 
YGLTSSYSLWWMLRSSLKQYSFEALREKSNYSDIPDVKNDFAFI 
LHLADQYDPLYSKRFSIFLSEVSENKLKQINLNNEWTVEKLKSK 
LVKNAQDKIELHLFMLNGLPDNVFELTEMEVLSLELIPEVKLPS 
AVSQLVNLKELRVYHSSLWDHPALAFLEENLKILRLKFTEMGK 
IPRWVFHLKKTLKELYLSGCVLPEQLSTMQLEGFQDLKNLRTLYL 
KSSLSRIPQWTDLLPSLQKLSLDNEGSKLWLNNLKKMVNLKS 
LELISCDLERIPHSIFSLNNLHELDLRENKLKTVEEIISFQHLQ 
NLSCLKLWHNNIAYIPAQIGALSNLEQLSLDHNNIENLPLQLFL 
CTKLHYLDLSYNHLTFIPEEIQYL\SNLQYFAVTNNNIEMLPDG 
LFQCKKLQCLLLGKNSLMNLSPHVGELSNLTHREPIG\NYLETL 
PPELEGCQSLKRNCLIVEENLLNTLPLPVTERLQTCLDKC 


5938 


39S 


1865 


YKGEGFFCNQEARGERRKKKKAMSSPNIWSTGSSVYSTPVFSQK 
MTVWILLLLSLYPGFTSQKSDDDYEDYASNKTWVLTPKVPEGDV 
TVILNNLLEGYDNKLRPDIGVKPTLIHTDMYVNSIGPVNAINME 
YTIDIFFAQTWYDRRLKFNSTIKVLRLNSNMVGKIWIPDTFFRN 
SKKADAHWITTPNRMLRIWNDGRVLYSLRLTIDAECQLQLHNFP 
MDEHSCPLEFSSYGYPREEIVYQWKRSSVEVGDTRSWRLYQFSF 
VGLRNTTEWKTTSGDYWMSVYFDLSRRMGYFTIQTYIPCTLI 
WLSWVSFWINKDAVPARTSLGITTVLTMTTLSTIARKSLPKVS 
YVTAMDLFVSVCFIFVFSALVEYG\TLHYFVSNRKPSKDKDKKK 
KNPAPTIDIRPRSATIQMNNATHLQERDEEYGYECLDGKDCASF 
FCCFEDCRTGAWRHGRIHIRIAKMDSYARIFFPTAFCLFNLVYW 
VSYLYL 


5939 


66 


1404 


IRPGYLKEVQENSPGHRAGLEPFFDFIVSINGSRLNKDNDTLKD 
LLKANVEKPVKMLIYSSKTLELRETSVTPSNLWGGQGLLGVSIR 
FCSFDGANENVWHVLEVESNSPAALAGLRPHSDYIIGADTVMNE 
SEDLFSLIETHEAKPLKLYVYNTDTDNCREVIITPNSAWGGEGS 
LGCGIGYGYLHRIPTRPFEEGKKISLPGQMAGTPITPLKDGFTE 
VQLSSVNPPSLSPPGTTGIEQSLTGLSISSTP\PAVSSVLSTGV 
PTVP\LLPPQVNQSLTSVPPMESSYLHLPGLMPFTRQGLPNLPQ 
PSTFNLPR\PTHSWPGVGLYQEFVKPGVLPPLSSMPPRNLPG\I 
APLPLPSEFLPSFPLVPESSSAASSGELLSSLPPTSNAPSDPAT 
TTAKADAASSLTVDVTPPTAKAPTTVEDRVGDSTPVSEKPVSAA 
VDANASESP 


5940 


145 


717 


RRSASRSASPRQSAGTAVTTGTRAGGTCLAAAHHRMRWRADGRS 
LEKLPVHMGLVITEVEQEPSFSDIASLWWCMAVGISYISVYDH 
QGIFKRNNSRLMDEILKQQQELLGLDCSKYSPEFANSNDKDDQV 
LNCHLAVKVLSPEDGKADIVRAAQDFCQLVAQKQKRPTDLDVDT 
LA\VYLVQMWLILI 


5941 


13 


6147 


MCLGRMGASSPRSPEPVGPPAPGLPFCCGGSLLAVWLLALPVA 

WGQCNAPEW\LPFARPTNLTDEFEFPIGTYLNYECRPGYSGRPF 

SIICLKNSVWTGAKDRCRRKSCRNPPDPVNGMVHVIKGIQFGSQ 

IKYSCTKGYRLIGSSSATCIISGDTVIWDNETPICDRIPCGLPP 

TITNGDFISTNRENFHYGSWTYRCNPGSGGRKVFELVGEPSIY 

CTSNDDQVGIWSGPAPQCIIPNKCTPPNVENGILVSDNRSLFSL 

NEWEFRCQPGFVMKGPRRVKCQALNKWEPELPSCSRVCQPPPD 

VLHAERTQRDKDNFSPGQEVFYSCEPGYDLRGAASMRCTPQGDW 

SPAAPTCEVKSCDDFMGQLLNGRVLFPVNLQLGAKVDFVCDEGF 

QLKGSSASYCVLAGMESLWNSSVPVCEQIFCPSPPVIPNGRHTG 

KPLEVFPFGKAVNYTCDPHPDRGTSFDHGESTIRCTSDPQGNG 

VWSSPAPRCGILGHCQAPDHFLFAKLKTQTNASDFPIGTSLKYE 

CRPEYYGRPFSITCLDNLVWSSPKDVCKRKSCKTPPDPVNGMVH 

VXTDIQVGSRINYSCTTGHRLIGHSSAECILSGNAAHWSTKPPI 

CQRIPCGLPPTIANGDFISTNRENFHYGSWTYRCNPGSGGRKV 

FELVGEPSIYCTSNDDQVGIWSGPAPQCIIPNKCTPPNVENGIL 

VSDNRSLFSLNEWEFRCQPGFVMKGPRRVKCQALNKWEPELPS 

CSRVCQPPPDVLHAERTQRDKDNFSPGQEVFYSCEPGYDLRGAA 

SMRCTPQGDWSPAAPTCEVKSCDDFMGQLLNGRVLFPVNLQLGA 

KVDFVCDEGFQLKGSSASYCVLAGMESLWNSSVPVCEQIFCPSP 

PVIPNGRHTGKPLEVFPFGKAVNYTCDPHPDRGTSFDLIGESTI 

RCTSDPQGNGVWSSPAPRCGILGHCQAPDHFLFAKLKTQTNASD 
FPIGTSLKYECRPEYYGRPFSITCLDNLVWSSPKDVCKRKSCKT 
PPDPVNGMVHVITDIQVGSRINYSCTTGHRLIGHSSAECILSGN 
TAHWSTKPPICQRIPCGLPPTIANGDFISTNRENFKYGSWTYR 
CNLGSRGRKVFELVGEPSIYCTSNDDQVGIWSGPAPQCIIPNKC 
TPPNVENGILVSDNRSLFSLNEWEFRCQPGFVMKGPRRVKCQA 
LNKWEPELPSCSRVCQPPPEILHGEHTPSHQDNFSPGQEVFYSC 
EPGYDLRGAASLHCTPQGDWSPEAPRCAVKSCDDFLGQLPHGRV 
LFPLNLQLGAKVSFVCDEGFRLKGSSVSHCVLVGMRSLWNNSVP 
VCEHIFCPNPPAILNGRHTGTPSGDIPYGKEISYTCDPHPDRGM 
TFNLIGESTIRCTSDPHGNGVWSSPAPRCELSVRAGHCKTPEQF 
PFASPTIPINDFEFPVGTSLNYECRPGYPGKMPSISCLENLVWS 
SVEDNCRRKSCGPPPEPFNGMVHINTDTQFGSTVNYSCNEGFRL 
IGSPSTTCLVSGNNVTWDKKAPICEIISCEPPPTISNGDFYSNN 
RTSFHNGTWTYQCHTGPDGEQLFELVGERSIYCTSKDDQVGVW 
SSPPPRCISTNKCTAPEVENAIRVPGNRSFFSLTBIIRFRCQPG 
FVMVGSHTVQCQTNGRWGPKLPHCSRVCQPPPEILHGEHTLSHQ 
DNFSPGQEVFYSCEPSYDLRGAASLHCTPQGDWSPEAPRCTVKS 
CDDFLGQLPHGRVLLPLNLQLGAKVSFVCDEGFRLKGRSASHCV 
LAGMKALWNSSVPVCEQIFCPNPPAILNGRHTGTPLGDIPYGKE 
VSYTCDPHPDRGMTFNLIGESTIRRTSEPHGNGVWSSPAPRCEL 
PVGAACPHPPKIQNGHYIGGHVSLYLPGMTISYTCDPGYLLVGK 
GFIFCTDQGIWSQLDHYCKEVNCSFPLFMNGISKELEMKKVYHY 
GDYVTLKCEDGYTLEGSPWSQCQADDRWDPPLAKCTSRTHDALI 
VGTLSGTIFFILLIIFLSWIILKHRKGNNAHENPKEVAIHLHSQ 
GGSSVHPRTLQTNEENSRVLP 


5942 


4509 


688 


YLYVRMRANPLAYGISHKAYQIDPPL\RKHREQ\LVIE\VGRKL 

DK\AQMIRFEERTGYFSSTDLGRTASHYYIKYNTIETFNELFDA 

HKTEGDIFAIVSKAEEFDQIKVREEEIEELDTLLSNFCELSTPG 

GVENSYGKINILLQTYINRGEMDSFSLISDSAYVAQNAARIVRA 

LFEIALRKRWPTMTYRLLNLSKAIDKRLWGWASPLRQFSILPPH 

MLTRLEEKKLTVDKLKDMRKDEIGHILHHVNIGLKVKQCVHQIP 

SVMMEAFIQPITRTVLRVTLSIYADFTWNDQVHGTVGEPWWIWV 

EDPTNDHIYHSEYFLALKKQVISKEAQLLVFTIPIFEPLPSQYY 

IRAVSDRWLGAEAVCIINFQHLILPERHPPHTELLDLQPLPITA 

LGCKAYEALYNFSHFNPVQTQIFHTLYHTDCNVLLGAPTGSGKT 

VAAELAIFRVFNKYPTSKAVYIAPLKALVRERMDDWKVRIEEKL 

GKKVIELTGDVTPDMKSIAKADLIVTTPEKWDGVSRSWQNRNYV 

QQVTILIIDEIHLLGEERGPVLEVIVSRTNFISSHTEKPVRIVG 

LSTALANARDLADWLNIKQMGLFNFRPSVRPVPLEVHIQGFPGQ 

HYCPRMASMNKPAFQAIRSHSPAKPVLIFVSSRRQTRLTALELI 

AFLATEEDPKQWLNMDEREMENIIATVRDSNLKLTLAEi'GIGMHH 

AGLHERDRKTVEELFVNCKVQVLIATSTLAWGVNFPAHLVIIKG 

TEYYDGKTRRYVDFPITDVLQMMGRAGRPQFDDQGKAVILVHDI 

KKDFYKKFLYEPFPVESSLLGVLSDHLNAEIAGGTITSKQDALD 

YITWTYFFRRLIMNPSYYNLGDVSHDSVNKFLSHLIEKSLIELE 

LSYCIEIGEDNRSIEPLTYGRIASYYYLKHQTVKMFKDRLKPEC 

STEELLSILSDAEEYTDLPVRHNEDHMNSELAKCLPIESNPHSF 

DSPHTKAHLLLQAHLSRAMLPCPDYDTDTKTVLDQALRVCQAML 

DVAANQGWLVTVLNITNLIQMVIQGRWLKDSSLLTLPNIENHHL 

HLFFCKWKPIMKGPHARGRTSIECLPELIHACGGKDHVFSSMVES 

ELHAAKTKQAWNFLSHLPEINVGISVKGSWDDLVEGHNELSVST 

LTADKRDDNKWIKLHADQEYVLQVSLQRVHFGFHKGKPESCAVT 

PRFPKSKDEGWFLILGEVDKRELIALKRVGYIRNHHVASLSFYT 

PEIPGRYIYTLYFMSDCYLGLDQQYD/NLSQRYTSESFCTGQHQ 
GL 


5943 


1 


2274 


DKPTRHKTYLSSSWAKMAAAEGPVGDGELWQTWLPNHWFLRLR 
EGLKNQSPTEAEKPASSSLPSSPPPQLLTRNWFGLGGELFLWD 
GEDSSFLWRLRGPSGGG\EEPALSQYQRLLCINPPLFEIYQVL 
LSPTQHHVALIGIKGLMVLELPKRWGKNSEFEGGKSTVNCSTTP 
VAERFFTSSTSLTLKHAAWYPSEILDPHWLLTSDNVIRIYSLR 
* aim v j. j.iJOiirtODiJoijVi^tt.tjKAx InoLnjh i AVAr DFGPIiA 
AVPKTLFGQNGKDEWAYPLYILYENGETFLTYISLLHSPGN/I 
WKAVGSIAHAS\AAEDNYGYDACAVLCLPCVPNILVIATESGML 
YHCWLEGEEEDDHTSEKSWDSRIDLIPSLYVFECVELELALKL 
ASGEDDPFDSDFSCPVKLHRDPKCPSRYHCTHEAGVHSVGLTWI 
HKLHKFLGSDEEDKDSLQELSTEQKCFVEHILCTKPLPCRQPAP 
IRGFWIVPDILGPTMICITSTYECLIWPLLSTVHPASPPLLCTR 
EDVEVAESPLRVLAETPDSFEKHIRSILQRSVANPAFLKASEKD 
IAPPPEECLQLLSRATQVFREQYILKQDLAKEEIQRRVKLLCDQ 
KKKQLEDLSYCREERKSLREMAERLADKYEEAKEKQEDIMNRMK 
KLLHSFHSELPVLSDSERDMKKELQLIPDQLRHLGNAIKQVTMK 
KDYQQQKMEKVLSLPKPTIILSAYQRKCIQSILKEEGEHIREMV 
KQINDIRNHVNF 


5944 


167 


3428 


FSIATFTDEPEVLTEPPSATTTTTIGISATWTTLAGSHGKRNNT 
ITTTSSKRKNRKNKITPENVQIIFDDPLPISYSQPEKVNGESKS 
SSTSESGDSDNMRISSCSDESSNSNSSRKSDNHSPAWTTTVSS 
KKQPSVLVTFPKEERKSVSGKASIKLSETISEGTSNSLSTCTKS 
GPSPLSSPNGKLTVASPKRGQKREEGWKEWRRSKKVSVPSTVI 
SRVIGRGGCNINAIREFTGAHIDIDKQKDKTGDRIITIRGGTES 
TRQATQLINALIKDPDKEIDELIPKNRLKSSSANSKIGSSAPTT 
TAANTSLMGIKMTTVALSSTSQTATALTVPAISSASTHKTIKNP 
VN\NVRPGFPVSFP\LAYPPPQFAHALLAAQTFQQIRPPRLPMT 
HFGGTFPPAQSTWGPFPVRPLSPARATNSPKPHMVPRHSNQNSS 
GSQWSAGSLTSSPTTTTSSSASTVPGTSTNGSPSSPSVRRQLF 
VTWKTSNATTTTVTTTASNNNTAPTNATYPMPTAKEHYPVSSP 
SSPSPPAQPGGVSRNSPLDCGSASPNKVASSSEQEAGSPPWET 
TNTRPPNSSSSSGSSSAHSNQQQPPGSVSQEPRPPLQQSQVPPP 
EVRMTVPPLATSSAPVAVPSTAPVTYPMPQTPMGCPQPTPKMET 
PAIRPPPHGTTAPHKNSASVQNSSVAVLSVNHIKRPHSVPSSVQ 
LPSTLSTQSACQNSVHPANKPIAPNFSAPLPFGPFSTLFENSPT 
SAHAFWGGSWSSQSTPESMLSGKSSYLPNSDPLHQSDTSKAPG 
FRPPLQRPAPSPSGIVNMDSPYGSVTPSSTHLGNFASNISGGQM 
YGPGAPLGGAPAAANFNRQHFSPLSLLTPCSSASNDSSAQSVSS 
GVRAPSPAPSSVPLGSEKPSNVSQDRKVPVPIGTERSARIRQTG 
TSAPSVIGSNLSTSVGHSGIWSFEGIGGNQDKVDWCNPGMGNPM 
IHRPMSDPGVFSQHQAMERDSTGIVTPSGTFHQHVPAGYMDFPK 
VGGMPFSVYGNAMIPPVAPIPDGAGGPIFNGPHAADPSWNSLIK 
MVSSSTENNGPQTVWTGPWAPHMNSVHMNQLG 


5945 


1461 


197 


GVTHLFLFGKRKLRNGIAEDLKGQADFFFLLVSEAWATGSPRA 
WLTCLILPLPGIIFSVLPKAMSRPLLITFTPATDPSDLWKDGQQ 
QPQPEKPESTLDGAAARAFYEALIGDESSAPDSQRSQTEPARER 
KRKKRRIMKAPAAEAVAEGASGRHGQGRSLEAEDKMTHRILRAA 
QEGDLPELRRLLEPHEAGGAGGNINARDAFWWTPLMCAARAGQG 
AAVSYLLGRGAAWVGVCELSGRDAAQLAEEAGFPEVARMVRESH 
GETRSPENRSPTPSLQYCENCDTHFQDSNHRTSTAHLLSLSQGP 
QPPNLPLGVPISSPGFKLLLRGGWEPGMGLGPRGEGRANPIPTV 
LKRDQEGLGYRSAPQPRVTHFPAWDTRAVAGRE\TPPRVATLSW 
REERRREE\KDRAWERDLRTYMNLEF 


5946 


541 


1666 


ILGSYSSIQPEEYS\SWC\EWLQDLLA\YVSPK\HSYLRDLP 
SEGSPQRVNSIDFV\EL\EHLQPDVLVHAVLRWDF/TILTEAV 
YSYRGQKQKKVMLTVEQAQDQHYALVLWGPGAAW\YPQLQRKKG 
YIWEFKYLFVQCNYTLENLELHTTPWSSCECLFDDDIRAITFKA 
KFQKSAPSFVKISDLATHLEDKCSGWLIKAQISELAFPITASQ 
KIALNAHSSLKSIFSSLPNIVYTGCAKCGLELETDENRIYKQCF 
SCLPFTMKKIYYRPALMTAIDGRHDVCIRVESKLIEKILLNISA 
DCLNRVIVPSSEITYGMWADLFHSLLAVSAEPCVLKIQSLFVL 
DENSYPLQQDFSLLDFYPDIVKHGANARL 


5947 


3 


1317 


RGIPDRRRRGPIGRVNMDLENKVKKMGLGKEQGFGAPCLKCKEK 
CEGFELHFWRKICRNC\NVAKKSM/TVLLSNEEDRKVGKLFEDT 
KYTTLIAKLKSDGIPMYKRNVMILTNPVAAKKNVSINTVTYEWA 
PPVQNQALARQYMQMLPKEKQPVAGSEGAQYRKKQLAKQLPAHD 
QDPSKCHELSPREVKEMEQFVKKYKSEALGVGDVKLPCEMDAQG 
PKQMNIPGGDRSTPAAVGAMEDKSAEHKRTQYSCYCCKLSMKEG 
DPAIYAERAGYDKLWHPACFVCSTCHELLVDMIYFWKNBKLYCG 
RHYCDSEKPRCAGCDELIFSNEYTQAENQNWHLKHFCCFDCDSI 
LAGEIYVMVNDKPVCKPCYVKNHAWCQGCHNAIDPEVQRVTYN 
NFSWHASTECFLCSCCSKCLIGQKFMPVEGMVFCSVECKKRMS 


5948 


39 


3370 


YRERYPVSGGSVLRSALEVCWDFLSGLTEGSLLPEGFFSGPIDQ 
GNHYQMRRKGRCHRGSAARHPSSPCSVKHSPTRETLTYAQAQRM 
VEIEIEGRLHRISIFDPLEIILEDDLTAQEMSECNSNKENSERP 
PVCLRTKRHKNNRVKKKNEALPSAHGTPASASALPEPKVRIVEY 
SPPSAPRRPPVYYKFIEKSAEELDNEVEYDMDEEDYAWLEIVNE 

KRKGDCVPAVSQSMFEFLMDRFEKESHCENQKQGEQQSLIDEDA 

VCCICMDGECQNSNVILFCDMCNLAVHQECYGVPYIPEGQWLC/ 

RAHCLQSRARPADCVLCPNKGGAFKKTDDDRWGHV\VCALW\IP 

E\VGFANTVFIEPIDGVRNIPPARWKLT\CNLCKEKGR/VGACI 

QCHKANCYTAFHVTCAQKAGLYMKMEPVKELTGGGTTFSVRKTA 

YCDVHTPPGCTRRPLNIYGDVEMKNGVCRKESSVKTVRSTSKVR 

KKAKKAKKALAEPCAVLPTVCAPYIPPQRLNRIANQVAIQRKKQ 

FVERAHSYWLLKRLSRNGAPLLRRLQSSLQSQRSSQQRENDEEM 

KAAKEKLKYWQRLRHDLERARLLIELLRKREKLKREQVKVEQVA 

MELRLTPLTVLLRSVLDQLQDKDPARIFAQPVSLKEVPDYLDHI 

KHPMDFATMRKRLEAQGYKNLHEFEEDFDLIIDNCMKYNARDTV 

FYRAAVRLRDQGGWLRQARREVDSIGLEEASGMHLPERPAAAP 

RRPFSWEDVDRLLDPANRAHLGLEEQLRELLDMLDLTCAMKSSG 

SRSKRAKLLKKEIALLRNKLSQQHSQPLPTGPGLEGFEEDGAAL 

GPEAGEEVLPRLETLLQPRKRSRSTCGDSEVEEESPGKRLDAGL 

TNGFGGARSEQEPGGGLGRKATPRRRCASESSISSSNSPLCDSS 

FNAPKCGRGKPALVRRHTLEDRSELISCIENGNYAKAARIAAEV 

GQSSMWISTDAAASVLEPLKVVWAKCSGYPSYPALIIDPKMPRV 

PGHHNGVTIPAPPLDVLKIGEHMQTKSDEKLFLVLFFDNKRSWQ 

WLPKSKMVPLGIDETIDKLKMMEGRNSSIRKAVRIAFDRAMNHL 

SRVHGEPTSDLSDID 


5949 


39 


3370 


YRERYPVSGGSVLRSALEVCWDFLSGLTEGSLLPEGFFSGPIDQ 
GNHYQMRRKGRCHRGSAARHPSSPCSVKHSFTRETLTYAQAQRM 
VEIEIEGRLHRISIFDPLEIILEDDLTAQEMSECNSNKENSERP 
PVCLRTKRHKNNRVKKKNEALPSAHGTPASASALPEPKVRIVEY 
SPPSAPRRPPVYYKFIEKSAEELDNEVEYDMDEEDYAWLEIVNE 
KRKGDCVPAVSQSMFEFLMDRFEKESHCENQKQGEQQSLIDEDA 
VCCICMDGECQNSNVILFCDMCNLAVHQECYGVPYIPEGQWLC/ 
RAHCLQSRARPADCVLCPNKGGAFKKTDDDRWGHV\VCALW\IP 
E\VGFANTVFIEPIDGVRNIPPARWKLT\CNLCKEKGR/VGACI 
QCHKANCYTAFHVTCAQKAGLYMKMEPVKELTGGGTTFSVRKTA 
YCDVHTPPGCTRRPLNIYGDVEMKNGVCRKESSVKTVRSTSKVR 
KKAKKAKKALAEPCAVLPTVCAPYIPPQRLNRIANQVAIQRKKQ 

FVERAHSYWLLKRLSRNGAPLLRRLQSSLQSQRSSQQRENDEEM 
KAAKEKLKYWQRLRHDLERARLLIELLRKREKLKREQVKVEQVA 
MELRLTPLTVLLRSVLDQLQDKDPARIFAQPVSLKEVPDYLDHI 
KHPMDFATMRKRLEAQGYKNLHEFEEDFDLIIDNCMKYNARDTV 
FYRAAVRLRDQGGWLRQARREVDSIGLEEASGMHLPERPAAAP 
RRPFSWEDVDRLLDPANRAHLGLEEQLRELLDMLDLTCAMKSSG 
SRSKRAKLLKKEIALLRNKLSQQHSQPLPTGPGLEGFEEDGAAL 
GPEAGEEVLPRLETLLQPRKRSRSTCGDSEVEEESPGKRLDAGL 
TNGFGGARSEQEPGGGLGRKATPRRRCASESSISSSNSPLCDSS 
FNAPKCGRGKPALVRRHTLEDRSELISCIENGNYAKAARIAAEV 
GQSSMWISTDAAASVLEPLKWWAKCSGYPSYPALIIDPKMPRV 
PGHHNGVTIPAPPLDVLKIGEHMQTKSDEKLFLVLFFDNKRSWQ 
WLPKSKMVPLGIDETIDKLKMMEGRNSSIRKAVRIAFDRAMNHL 
SRVHGEPTSDLSDID 


5950 


1166 


373 


ESRSLTMSTSQPGACPCQGAASRPAILYALLSSSLKAVPRPRSR 

CLCRQHRPVQLCAPHRTCREALDVLAKTVAFLRNLPSFWQLPPQ 

DQRRLLQGCWGPLFLLGLAQDAVTFEVAEAPVPSILKKILLEEP 

SSSGGSGOLPDRPOPSIiAAVnWT»OPrT.T?QT71(JOT PT cdvp\ vnnr 

KGPILFNPDVPGLQAASHIGHLQQEAHWVLCEVLEPWCPAAQGR 
LTRVLLTASTLKSIPTSLLGDLFFRPIIGDVDIAGLLGDMLLLR 


5951 


143 


5449 


WNVKPSLLWQLFKFSDKEEHEQNDSISGKTGETGVEEMIATRK 
VEQDSKETVKLSHEDDHILEDAGSSDISSDAACTNPNKTENSLV 
GLPSCVDEVTECNLELKDTMGIADKTENTLERNKIEPLGYCEDA 
ESNRQLESTEFNKSNLEWDTSTFGPESNILENAICDVPDQNSK 
QLNAIESTKIESHETANLQDDRNSQSSSVSYLESKSVKSKHTKP 
VIHSKQNMTTDAPKKIVAAKYEVIHSKTKVNVKSVKRNTDVPES 
QQNFHRPVKVRKKQIDKEPKIQSCNSGVKSVKNQAHSVLKKTLQ 
DQTLVQIFKPLTHSLSDKSHAHPGCLKEPHHPAQTGHVSHSSQK 

QCHKPQQQAPAMKTNSHVKEELEHPGVEHFKEEDKLKLKKPEKN 

LQPRQRRSSKSFSLDEPPLFIPDNIATIRREGSDHSSSFESKYM 

WTPSKQCGFCKKPHGNRFMVGCGRCDDWFHGDCVGLSLSQAQQM 

GEEDKEYVCVKCCAEEDKKTEILDPDTLENQATVEFHSGDKTME 

CEKLGLSKHTTNDRTKYIDDTVKHKVKILKRESGEGRNSSDCRD 

NEIKKWQLAPLRKMGQPVLPRRSSEEKSEKIPKESTTVTCTGEK 

ASKPGTHEKQEMKKKKV\EKGVLNVHPAASASKPSADQIRQSVR 

HSLKDILMKRLTDSNLKVPEEKAAKVATKIEKELFSFFRDTDAK 

YKNKYRSLMFNLKDPKNNILFKKVLKGEVTPDHLIRMSPEELAS 

KELAAWRRRENRHTIEMIEKEQREVERRPITKITHKGEIEIESD 

APMKEQEAAMEIQEPAANKSLEKPEGSEK\RKEEVDSMSKDTTS 

QHRQHLFDLNCKICIGRMAPPVDDLSPKKVKVWGVARKHSDNE 

AESIADALSSTSNILASEFFEEEKQESPKSTFSPAPRPEMPGTV 

EVESTFLARLNFIWKGFINMPSVAKFVTKAYPVSGSPEYLTEDL 

PDSIQVGGRISPQTVWDYVEKIKASGTKEICWRFTPVTEEDQI 

SYTLLFAYFSSRKRYGVAANNMKQVKDMYLIPLGATDKIPHPLV 

PFDGPGLELHRPNLLLGLIIRQKLKRQHSACASTSHIAETPESA 

PPIALPPDKKSKIEVSTEEAPEEENDFFNSFTTVLHKQRNKPQQ 

NLQEDLPTAVEPLMEVTKQEPPKPLRFLPGVLIGWENOPTTLEL 

ANKPLPVDDILQSLLGTTGQVYDQ\AQSVMEQNTVKEIPFLNEQ 

TNSKIEKTDNVEVTDGENKEIKVKVDNISESTDKSAEIETSWG 

SSSISAGSLTSLSLRGKPPDVSTEAFLTNLSIQSKQEETVESKE 

KTLKRQLQEDQENNLQDNQTSNSSPCRSNVGKGNIDGNVSCSEN 

LVANTARSPQFINLKRDPRQAAGRSQPVTTSESKDGDSCRNGEK 

HMLPGLSHNKEHLTEQINVEEKLCSAEKNSCVQQSDNLKVAQNS 

PSVENIQTSQAEQAXPLQEDILMQNIETVHPFRRGSAVATSHFE 

VGNTCPSEFPSKSITFTSRSTSPRTSTNFSPMRPQQPNLQHLKS 

SPPGFPFPGPPNFPPQSMFGFPPHLPPPLLPPPGFG\FA\QNPM 

VPWPPW\HLP\GQPQRMMGPLSQASRYIGPQNFYQVKDIRRPE 

RRHSDPWGRQDQQQLDRPFNRGKGDRQRFYSDSHHLKRERHEKE 

WEQESERHRRRDRSQDKDRDRKSREEGHKDKERARLSHGDRGTD 

GKASRDSRNVDKKPDKPKSEDYEKDKEREKSKHREGEKDRDRYH 

KDRDHTDRTKSKR 


5952 


3226 


639 


PPARRSARDLPRALSMEAARPSGSWNGALCRLLVLVTL\AFLIF 
ASDACKNVTLHVPSKLDAEKLVGRVNLKECFTAANLIHSSDPDF 

QILEDGSVYTTNTILLSSEKRSFTILLSNTENQEKKKIFVFLEH 
QTKVLKKRHTKEKVLRRAKRRWAPIPCSMLENSLGPFPLFLQQV 
QSDTAQNYTIYYSIRGPGVDQEPRNLFYVERDTGNLYCTRPVDR 
EQYESFEIIAFATTPDGYTPELPLPLIIKIEDENDNYPIFTEET 
YTFTIFENCRVGTTVGQVCATDKDEPDTMHTRLKYSIIGQVPPS 
PTLFSMHPTTGVITTTSSQLDRELIDKYQLKIKVQDMDGQYFGL 
QTTSTCIINIDDVNDHLPTFTRTSYVTSVEENTVDVEILRVTVE 
DKDLVNTANWRANYTILKGNENGNFKIVTDAKTNEGVLCWKPL 
NYEEKQQMILQIGWNEAPFSREASPRSAMSTATVTVNVEDQDE 
GPECNPPIQTVRMKENAEVGTTSNGYKAYDPETRSSSGIRYKKL 
TDPTGWVTIDENTGSIKVFRSLDREAETIKNGIYNITVLASDQG 
GRTCTGTLGIILQDVNDNSPFIPKKTVIICKPTMSSAEIVAVDP 
DEPIHGPPFDFSLESSTSEVQRMWRLKAINDTAARLSYQNDPPF 
GSYWPITVRDRLGMSSVTSLDVTLCDCITENDCTHRVDPRIGG 
GGVQLGKWAILAILLGIALFFCILFTLVCGASGTSKQPKVIPDD 
LAQQNLIVSNTEAPGDDKVYSANGFTTQTVGASAQGVCGTVGSG 
IKNGGQETIEMVKGGHQTSESCRGAGHHHTLDSCRGGHTEVDNC 
RYTYSEWHSFTQPRLGEESIRGHTLIKN 


5953 


330 


811 


PLLCNPDPGWYWWVKQESEISKESQEMDARPKLDLGFKEGQTIK 
LCIGNITNKKGGASKPRTARGGGLSLLPPPPGGKVTIPPPSS/V 
KLPSTNHVTPPSIPKSNHGGSDADILLDLDSPAPVTTPAPTPVS 
VSNDLWGDFSTASSSVPNQAPQPSNWVQF 


5954 


32 


2130 


PPPPPPKLANMADLEAVLADVSYLMAMEKSKATPAARASKRIVL 
PEPSIRSVMQKYLAERNEITFDKIFNQKIGFLLFKDFCLNEINE 
AVPQVKFYEEIKEYEKLDNEEDRLCRSRQIYDAYIMKELLSCSH 
PFSKQAVEHVQSHLSKKQVTSTLFQPYIEEICESLRGDIFQKFM 
ESDKFTRFCQWKNVELNIHLTMNEFSVHRIIGRGGFGEVYGCRK 
ADTGKMYAMKCLNKKRIKMKQGETLALNERIMLSLVSTGDCPFI 
VCMTYAFKTPDKLCFILDLMNGGDLHYHLSQHGVFSEKEMRFYA 
TEIILGLEHMHNRFWYRDLKPANILLDEHGHARIS\DLGLACD 
FSKKKPHASVGTHGYMAPEVLQKGTAYDSSADWFSLGCMLFKLL 
RGHSPFRQHKTKDKHEIDRMTLTVNVELPDTFSPELKSLLEGLL 
QRDVSKRIX3CHGGGSQEVKEHSFFKGVDWQHVYLQKYPPPLIPP 
RGEVNAADAFDIGSFDEEDTKGIKLLDCDQELYKNFPLVISERW 
QQEVTETVYEAVNADTDKIEARKRAKNKQLGHEEDYALGKDCIM 
HGYMLKLGNPFLTQWQRRYFYLFPNRLEWRGEGESRQNLLTMEQ 
ILSVEETQIKDKKCILFRIKGGKQFVLQCESDPEFVQWKKELNE 
TFKEAQRLLRRAPKFLNKPRSGTVELPKPSLCHRNSNGL 


5955 


1726 


444 


KREREFRLAVCPLRYPSAYESSPGTELRECGLCRSGQEFADCRR
PANRQDVLSGWINLPVLQLTKDPLKTPGRLDHGTRTAFIHHREQ
VWKRCINIWRDVGLFGVLNEIANSEEEVFEWVKTASGWALALCR
WASSLHGSLFPHLSLRSEDLIAEFAQVTNWSSCCLRVFAWHPHT
NKFAVALLDDSVRVYNASSTIVPSLKHRLQRNVASLAWKPLSAS
VLAVACQSCILIWTLDPTSLSTRPSSGCAQVLSHPGHTPVTSLA
WAPSGGRLLSASPVDAAIRVWDVSTETCVPLPWFRGGGVTNLLW
SPDGSKILATTPSAVFRVWEAQMWTCERWPTLSGRCQTGCWSPD
GSRLLFTVLGEPLIYSLSFPERCGEGKG\ALEVQSQQRLWQICL 
RQQYRHQMVRRGLGERLTPWSGTPVGNVWLCL 


5956 


1705 


139 


GVGVRGARAMATVQEKAAALNLSALHSPAHRPPGFSVAQKPFGA 
TYVWSS I INTLQTQVEVKKRRHRLKRHNDCFVGSEAVDVI FSHL 
IQNKYFGDVDIPRAKWRVCQALMDYKVFEAVPTKVFGKDKKPT 
FEDS S CS LYR FTT I PNQDSQLGKENKLYSPAR YADALFKS S D I R 
SASLEDLWENLSLKPANSPHVNISATLSPQVINEVWQEETIGRL 
LQLVDLPLLDSLLKQQEAVPKIPQPKRQSTMVNSSNYLDRGILK 
AYSDSQEDEWLSAAIDCSEYLPDQMWEISRSFPEQPDRTDLVK 
ELLFDAIGRYYSSREPLLNHLSDVHNGIAELLVNGKTEIALEAT 
QLLLKLLDFQNREEFRRLLYFMAVAANPSEFKLQKESDNRMWK 
RIFSKAIVDNKNLSKGKTDLLVLFL\MDHQKDVFKIPGTL\HKI 
VS \ VK \ LMAI QNGRDPNRDAG YI YCQR I DQRDYSNNTEKTTKDE 
LLNLLKTLDEDSKLSAKEKKK\LLGQFYKCHPDIFIEHFGD 


5957 


1479 


451 


ELQVAVAMDTLDRWKPKTKRAKRFLEKREPKLNENI KNAMLI K 
GGNANATVTKVLKDVYAL KKP YG VL Y KKKN I TRP FEDQTS LE F F 
SKKSDCSLFMFGSHNKKRPNNLVIGRMYDYHVLDMIELGIENFV 
S LKD I KNS KCPEGTKPML I FAGDDFDVTED YRRLKS LLIDFFRG 
PTVSNIRLAGLEYVLHFTALNGKIYFRSYKLLLKKSGCRTPRIE 
LEEMGPSLDLVLRRTHLASDDLYKLSMKMPKALKPKKKKNISHD 
TFGTTYGRIHMQKQDLSKLQTRKM\KGLKKRPAERITEDHEKKS 
KR I KKKLME LS Q PLLFHCVLLKR I I KHQS I QS F L 


5958 


1 


3138 


AAALGMLLWFPACQAFNLDVEKLrVYSGPKGSYFGYAVDFHIPD

ARTAS VLVGAPKANTSQPD I VEGGAVY YCPWPAEGS AQCRQ I P F 

DTTNNR K I R VNGTKE P I E F KSNQ W FG \ ATVKA\H KG KS CG P VAP 

LLFTWRNFLKPTPEKGPVGTCYVAIQNFSAYAEFSPCGNSNADP 

EGQGYCQAGFSLDFYKNGDLIVGGPGSFYWQGQVITASVADIIA 

N YS FKDI LRKLAGEKQTEVAPAS YDDS YLGYS VAAGEFTGDSQQ 

ELVAG I PRGAQNFGYVS I INS YDMTFIQNFTGEQMAS YFG YTW 

VSDVNSDGLDDVLVGAPLFMEREFESNPREVGQIYLYIiQVSSLL 

FRDPQILTGTETFGRFGSAMAHLGDLNQDGYNDIAIGVPFAGKD 

QRGKVLIYNGNKDGLNTKPFPKFCQGVWASHAVPSGFGFTLRGD 

SDIDKNDYPDLIVGAFGTGKVAVYRARPWTVDAQLLLHPMIIN 

LENKTCQVPDSMTSAACFSLRVCASVTGQSIANTIVLMAEVQLD 

SLKQKGAI KRTLFLDNHQAHRVFPLVTKRQKSHQCQDFIVYLRD 

ETEFRDKLSPINISLNY3LDESTFKEGLEVKPILNYYRENIVSE 

QAHILVDCGEDNLCVPDLKLSARPDKHQVI IGDENHLMLI INAR 

NE G EG A YE AE L F VM I P EEADYVG I E RNNKG FR P LS CE Y KMENVT 

RMWCDLGNPMVSGTNYSLGLRFAVPRLEKTNMSINFDLQIRSS 

NKDNPDSNFVSLQINITAVAQVEIRGVSHPPQIVLPIHNWEPEE 








EPHKEEEVGPLVEHIYELHNIGPSTISDTILEVGWPFSARDEFL 
LYIFHIQTLGPLQCQPNPNINPQDIKPAASPEDTPELSAFLRNS 
T I PHLVRKRD VHWEFHRQS PAK I LNCTN I ECLQ I S CAVGRLEG 
GESAVLKVRSRLWAHTFLQRKNDPYALASLVSFEVKKMPYTDQP 
AKLPEGSIAIKTSVIWATPNVSFSIPLWVIILAILLGLLVLAIL 
J i*ft±jrtj\v_Vjrr r jJK/vKrr'y b,um LDKnsjuj. WDKT PEA 


5959 


1 


1166 


GTSGYAAQQLPSLLKEREFHLGTLNKVFASQWLNHRQWCGTKC 
NTLFWDVQTSQITKIPILKDREPGGVTQQGCGIHAIELNP5RT 
LLATGGDNPNSLAIYRLPTLDPVCVGDDGHKDWIFSIAWISDTM 
AVSGSRDGSMGLWEVTDDVLTKSDARHNVSRVPVYAHITHKALK 
DIP KE DTN P DNC KVRALA FNNKN KE LGA VS LDG YFHLWKAENTL 
SKLLSTKLPYCRENVCLAYGSEWSVYAVGSQAHVSFLDPRQPSY 
NVKSVCSRERGSGIRSVSFYEHIITVGTGQGSLLFYDIRAQRFIi 
EERLSACYGSKPRLAGENLKLTTG\KGWLNHDETWRNYFSDIDF 
FPNAVYTKCYDSSGTKLFVAGGPLPSGLHGNYAGLWS 


5960 


2853 


870 


FVWSDGGPRPRRGPAVGAGAAHLSDPWAMTPGTANRATNPLNKE 
LD WAS INGFCEQLNEDFEGPPLATRLLAHKI QS PQEWEAI QALT 
VLETCMKSCGKRFHDEVGKFRFLNELIKWSPKYLGSRTSEKVK 
NKILELLYSWTVGLPEEVKIAEAYQMLKKQG\IVKSDPKLPDDT 
TF PLP PPRP KNVI FEDEEKS KMLARLLKSSHPEDLRAANKLI KE 
MVQEDQKRMEKISKKVNAIEEVNimVKLLTEMVMSHSQGGAAAG 
SSEDL\MKEL\YQRCERMRPTLFPTGRVDTEDND\EALAEILQA 
NDNLTQVINLYKQLVRGEEVNGDATAGS I PGSTS ALLDLS GLDL 
PPAGTTYPAMPTRPGEQASPEQPSASVSLLDDELMSLGLSDPTP 
PSGPSLDGTGWNSFQSSDATEPPAPALAQAPSMESRPPAQTSLP 
ASS G LDDL D LLG KTLLQQS LP P E S QQ VRWE KQQ PTPRLTLRDLQ 
NKSSSCSSPSSSATSLLHTVSPEPPRPPQQPVPTELSLASITVP 
LES IKPSN I LP VTVYDQHG FRILFHFARDPLPGRSDVL WWSM 
LSTAPQPIRNIVFQSAVPKVMKVKLQPPSGTELPAFNPIVHPSA 
ITQVLLLANPQKEKVRLRYKLTFTMGDQTYNEMGDVDQFPPPET 
WGSL 


5961 


198 


3147 


SGEPRPEPGNMATCIGEKIEDFKVGNLLGKGSFAGVYRAESIHT

GLEVAIKMIDKKAMYKAGMVQRVQNEVKIHCQLKHPSILELYNY 

FEDSNYVYLVLEMCHNGEMNR YLKNRVKPFSENEARHFMHQ I IT 

GML Y LH S HG I LHRDLTLS NLLLTRNMN I KI AD FG LATQL KM PHE 

KHYTLCGTPNYISPEIATRSAHGLESDVWSLGCMFYTLLIGRPP 

FDTDTVKNTLNKWLADYEMPTFLS I EAKDL IHQ L LRRNPADRL 

SLSSVLDHPFMSRNSSTKSKDLGTVEDSIDSGHATISTAITASS 

STSISGSLFDKRRLLIGQPLPNKMTVFPKNKSSTDFSSSGDGNS 

FYTQWGNQETSNSGRGRVIQDAEERPHSRYLRRAYSSDRSGTSN 

SQSQAKTYTMERCHSAEMLSVSKRSGGGENEERYS PTDNNANI F 

NFFKEKTSSSSGSFERPDNNQALSNHLCPGKTPFPFADPTPQTE 

T VQQ WFGNLQ I NAHLRKTTE YDS IS PNRDFQGH PDLQKDTS KNA 

WTDTKVKKNSD AS DNAHS VKQQNTMK YMTALHS KP E 1 1 QQE CVF 

GSDPLSEQSKTRGMEPPWGYQNRTLRSITSPLVAHRLKPIRQKT 

KKAWSILDSEEVCVELVKEYASQEYVKEVLQISSDGNTITIYY 

PNGG\RGFPLA\DRPPSPT\DNISR\YSF\DNLPEKYWRKYQYA 

SRFVQLVRSKSPKITYFTRYAKCILMENSPGADFEVWFYDGVKI 

HKTEDFIQVIEKTGKSYTLKSESEWSLKEEIKMYMDHANEGHR 

I CLALES 1 1 S EEERKTRSAPFFPI I IGRKPGSTSS PKALSPPPS 

VDSNYPTRDRASFNRMVMHSAASPTQAPILNPSMVTNEGLGLTT 

TASGTDISSNSLKDCLPK^AOTJ.K^VFVKTJVnwzVTOV T.TQf3avw 

VQFNDGSQLWQAGVSS ISYTSPNGQ\TTR\YGENEKLPDYI KQ 
KLQCLSSILLMFSNPTPNFH 


5962 


20 


2447 


RVCSS SASTASQAVMADAWEE I RRLAADFQRAQFAEATQRLS ER 
NCIEIVNKLIAQKQLEWHTLDGKEYITPAQISKEMRDELHVRG 
GRVNIVDLQQVINVDLIHIENRIGDIIKSEKHVQLVLGQLIDEN 
YLDRLAE EVNDKLQESGQVT I S ELCKTYDLPGNFLTQALTQRLG 
R I ISGHIDLDNRGV I FTEAFVARHKAR I RGLFSAI TRPTAVNS L 
ISKYGFQEQLLYSVLEELVNSGRLRGTWGGRQDKAVFVPDIYS 
RTQSTWVDS FFRQNG YLEFDALSRLG I PDAVS YI KKR YKTTQLL 








FLKAACVGQGLVDQVEASVEEAISSGTWVDIAPLLPTSLSVEDA 
AILLQQVMRAFSKQASTWFSDTVWSEKF\INDCTELFRELMH 
QKAEKEMKNNPVHLITEEDLKQISTLESVSTSKKDKKDERRRKA 
TEGSGSMRGGGGGNAREYKIKKVKKKGRKDDDSDDESQSSHTGK 
KKPE I S FMFQDE I EDFLRKHIQDAPEEFI SELAE YLI KPLNKTY 
LEWRS VFMS S TTSASGTGRKRTI KDLQEEVSNL YNNI RLFEKG 
MKFFADDTQAALTKHLLKS VCTD ITNLI FNFLASDLMMAVDDPA 
A I TS E I R KKI L S KLS EETKVALT KLHNSLNE KS I ED F I S CLDS A 
AEACDIMVKRGDKKRERQILFQHRQALAEQLKVTEDPALILHLT 
S VLL FQ FS THS MLHAPGRC V PQ 1 1 AFLNS KIP E DQHAL L VK YQG 

LWKQLVSQSKKTGQGDYPLNNELDKEQEDVASTTRKELQELSS 
S I KDLVLKSRKSS VTEE 


5963 


62 


1130 


PWNPQDFPGNRGLMG\QKGEIGPP\GQQGKKGAPGMP\GLMGSN 
GSPGQPGTPGSKGSKGEPGIQGMPGASGLKGEPGATGSPGEPGY 
MGLPG I QGKKGDKGNQGEKGIQGQKGENGRQG I PGQQG IQGHHG 
AKGERGEKGEPGVRGAIGSKGESGVDGLMGPAGPKGQPGDPGPQ 
GPPGLDGKPGREFSEQFIRQVCTDVIRAQLPVLLQSGRIRNCDH 
CLSQHGSPGIPGPPGPIGPEGPRGLPGLPGRDGVPGLVGVPGRP 
GVRGLKGLPGRNGEKGSQGFGYPGEQGPPGPPGPEGPPGISKEG 
PPGDPGLPGKDGDHGKPGIQGQPGPPGICDPSLCFSVIARRDPF 
RKGPNY 


5964


3 


2147 


SCRTRGRLSPLQPREAGSSRGSRARSEPPRPGGMEEACQVQTTK 
RGDPHELRNIFLQYASTEVDGERYMTPEDFVQRYLGLYNDPNSN 
PKIVQLLAGVADQTKDGLISYQEFLAFESVLCAPDSMFIVAFQL 
FDKSGNGEVTFENVKEIFGQTIIHHHIPFNWDCEFIRLHFGHNR 
KKHLNYTEFTQFLQELQLEHARQAFALKDKSKSGMISGLDFSDI 
MVTIRSHMLTPFVEENLVSAAGGSISHQVSFSYFNAFNSLLNNM 
EL VR K I YS TLAGTR KD AEVT KE E FAQ S A I R YGQAT PLE I D I L YQ 
LADLYNASGRLTLADIERIAPLAEGALPYNLAELQRQQSPGLGR 
PIWLQIAESAYRFTLGSVAGAVGATAVYPIDLVKTRMQNQRGSG 
SWGELMYKNSFDCFKKVLRYEGFFGIiYRGLIPQLIGVAPEKAI 
KLTVNDFVRDKFTRRDGSVPLPAEVLAGGCAGGSQVIFTNPLEI 
VKI RLQ VAGE I TTG PRVSALNVLRDLG I FGLYKG AKACFLRD I P 
FSAIYFPVYAHCKLLLADENGHVGGLNLIAAGAMAG\VPAASLV 
TPADVIKTRLQVAARAGQTTYSGVIDCFRKIL\REEGPSAFWKG 
TAARVFRSS PQFG \ VTL VT YELLQRG FY I D FGG L KPAGS EPTP K 

SRIADLPPANPDHIGGYRLATATFAGIENKFGLYLPKFKSPSVA 
WQPKAAVAATQ 


5965 


1 


1498 


MVT WL YRFL P TSNMAAKLRS LL P PDLRLQFWLHARIjQ KCFLSRG 
CGSYCAGAKASPIiPGKMAMGLMCGRRELLRLLQSGRRVHSVAGP 
SQWLGKPLTrRLLFPAAPCCCRPHYLFLAASGPRSLSrSAISFA 
EVQVQAPPWAATPS PTAVPEVASGETADWQTAAEQS FAELGL 
G S YTP VGL I QNLLE FMHVDLGL P WWGA I AACT VFARCL I FPL IV 
TGQREAARIHNHLPEIQKFSSRIREAKLAGDHIEYYKASSEMAL 
YQKKHGIKLYKPLILPVTQAPIFISFFIALREMANLPVPSLQTG 
GLWWFQDLTVSDPIYILPLAVTATMWAVLELGAETGVQSSDLQW 
MRNVIRMMPLITLPITMHFPTAVFMYWLSSNLFSLVQVSCLRIP 
AVRTVLKIPQRWHDLDiCLPPREGFLESFKKGWKNAEMTRQLRE 
REQRMRNQLELAARGPLRQTFTHNPLLQPGKDNPPNIPSS \SSS 
SSKPKSKYPWHDTLG 


5966 


102 


1925 


RSKQVMARLTKRRQADTKAIQHLWAAIEIIRNQKQIANIDRITK
YMSRVHGMHPKETTRQLSLAVKDGLIVETLTVGCKGSKAGIEQK
GYWLPGDEIDWETENHDWYCFECHLPGEVLICDLCFRVYHSKCL 
SDEFRLRDSSSPWQCPVCRSIKKKNTNKQEMGTYLRFIVSRMKE 
RAIDLNKKGKDNKHPMYRRLVHSAVDVPTIQEKVNEGKYRSYEE 
FKADAQLLLHNTVIFYGADSEQADIARMLYKDTCHEL\DELQLC 
KNCFYLANARPDNWFCYPCIPNHELDWAKMKGFGFWPAKVMQKE
DNQVDVRFFGHHHQRAWIPSENIQDITVNIHRLHVKRSMGWKKA
CDELELHQRFLREGRFWKSKNEDRGEEEAESSISSTSNEQLKVT 
QEPRAKKGRRNQSVEPKKEEPEPETEAVSSSQEIPTMPQPIEKV 
SVSTQTKKLSASSPRMLHRSTQTTNDGVCQSMCHDKYTKIFNDF








KDRMKSDHKRETERWREALEKLRSEMEEEKRQAVNKAVANMQG 
EMDRKCKQVKEKCKEEFVEEIKKLATQHKQLISQTKKKQWCYNC 
EEEAMYHCCWNTSYCSIKCQQEHWHAEHKRTCRRKR 


5967 


102 


1925 


RSKQVMARLTKRRQADTKAIQHLWAAIEIIRNQKQIANIDRITK

YMSRVHGMHPKETTRQLSLAVKDGLIVETLTVGCKGSKAGIEQE 

GYWLPGDEIDWETENHDWYCPECHLPGBVLICDLCFRVYHSKCL 

SDEFRLRDSSSPWQCPVCRSIKKKNTNKQEMGTYLRFIVSRMKE

RAIDLNKKGKDNKHPMYRRLVHSAVDVPTIQEKVNEGKYRSYEE

FKADAQLLLHNTVIFYGADSEQADIARMLYKDTCHEL\DELQLC 
KNCFYLANARPDNWFCYPCIPNHELDWAKMKGFGFWPAKVMQKE 
DNQVDVRFFGHHHQRAWIPSENIQDITVNIHRLHVKRSMGWKKA 
CDELELHQRFLREGRFWKSKNEDRGEEEAESSISSTSNEQLKVT 
QEPRAKKGRRNQSVEPKKEEPEPETEAVSSSQEIPTMPQPIEKV 
SVSTQTKKLSASSPRMLHRSTQTTNDGVCQSMCHDKYTKIFNDF
KDRMKSDHKRETERWREALEKLRSEMEEEKRQAVNKAVANMQG 
EMDRKCKQVKEKCKEEFVEEIKKLATQHKQLISQTKKKQWCYNC 
EEEAMYHCCWNTSYCSIKCQQEHWHAEHKRTCRRKR


5968 


81 


1288 


vi\r riu^wkjrti'f i vui roKUytavrljGPQRPGSEPDIPARGQPHPP 
RPVGVSTSAQAQVQPPAMHRRRLALGLGFCLLAGTSLSVLWVYL 
ENWLPVSYVPYYLPCPEIFNMKLHYKREKPLQPWWSQYPQPKL 
LEHRPTQLLTLT P WLAP I VS EGTFNPELLQH I YQ PLNLT IGVTV 
FAVGN/HFLESAEEFFMRGYRVHYYIFTDNPAAVPGVPLGPHRL 
QJ,r vi '"-'fiwii »> Ci Hi j. ^rjKKncj 1 XbynlAKKAHREVDYIjFCLDVD 

MVFRNPWGPETLGDLVAAIHPSYYAVPRQQFPYERRRVSTAFVA 

DSEGDFYYGGAVFGGQVARVYEFTRGCHMAILADKANGIMAAWR 

EESHLNRHFISNKPSKVLSPEYLWDDRKPQPPSLKLIRFSTLDK 
DISCLRS 


5969 


1126 


503 


DVGFNIKRKRCDLDVFLESPRKPSGRRDRAPEKQRRIAANKCLC
TGVREGEPPS/TTSQKVKEAGRDFTYLIWLFGISITGGLFYTI
FKELFSSSSPSKIYGRALEKCRSHPEVIGVFGESVKGYGEVTRR 
GRRQHVRFTEYVKDGLKHTCVKFYIEGSEPGKQGTVYAQVKENP 
GSGEYDFRYIFVEIESYPRRTIIIEDNRSQDD 


5970 


316 


4712 


SQDNIGHRLLQKHGWKLGQGLGKSLQGRTDPIPIWKYDVMGMG 

RMEMELDYAEDATERRRVLEVEKEDTEELRQKYKDYVDKEKAIA 

KALEDLRANFYCELCDKQYQKHQEFDNHINSYDHAHKQRLKDLK 

QREFARNVSSRSRKDEKKQEKALRRLHEIAEQRKQAECAPGSGP 

MFKPTTVAVDEEGGEDDKDESATNSGTGATASCGLGSEFSTDKG 

G P FTAVQ I TNTTGLAQAPGLASQG I S FG I KNNLGTPLQKLGVS F 

SFAKKAPVKLESIASVFKDHAEEGTSEDGTKPDEKSSDQGLQKV 

GDSDGSSNLDGKKEDEDPQDGGSLASTLSKLKRMKREEGAGATE 

PEYYHYIPPAHCKVKPNFPFLLFMRASEQMDGDNTTHPKNAPES 

KKGSSPKPKSCI KAAAS QG AE KT VS E VS EQP KETSMT E P S E PGS 

KAEAKKALGGDVSDQSLESHSQKVSETQMCESNSSKETSLATPA 

GKESQEGPKHPTGPFFPVLSKDESTALQWPSELLIFTKAEPSIS 

YSCNPLYFDFKLSRNKDARTKGTEKPKDIGSSSKDHLQGLDPGE 

PNKSKEVGGEKIVRSSGGRMDAPASGSACSGLNKQEPGGSHGSE 

TEDTGRSLPSKKERSGKSHRHKKKKKHKKSSKHKRKHKADTEEK 

SSKAESGEKSKKRKKRKRKKNKSSAPADSERGPKPEPPGSGSPA 

PPRRRRRAQDDSQRRSLPAEEGSSGKKDEGGGGSSSQDHGGRKH 

KGELPPSSCQRRAGTKRSSRSSHRSQPSSGDEDSDDASSHRLHQ 

KSPSQYSEEEEEEDSGSEHSRSRSRSGRRHSSHRSSRRSYSSSS 

DASSDQSCYSRQRSYSDDSYSDYSDRSRRHSKRSHDSDDSDYAS 

SKHRSKRHKYSSSDDDYSLSCSQSRSRSRSHTRERSRSRGRSRS 

SSCSRSRSKRRSRSTTAHSWQRSRSYSRDRSRSTRSPSQRSGSR 

KRSWGHESPEERHSGRRDFIRSKIYRSQSPHYFRSGRGEGPGKK 

DDGRGDDSKATGPPSQNSNIGTGRGSEGDCSPEDKNSVTAKLLL 

EKIQSRKVERKPSVSEEVQATPNKAGPKLKDPPQGYFGPKLPPS 

LGNKPVLPLIGKLPATRKPNKKCEESGLERGEEQEQSETEEGPP 

GSSDALFGHQFP\SESTTGPLLDPPPEESKSGEVTADHPVAPLG 

PPAHFDCYLGDPT1SHNYLPDPSDGNTLESXJ3SSSQPGPVESSL 

LPIAPDLEHFPSYAPPSGDPSIESTDGAEDA\SLAPLESQPITF 








TPEEMEKYSKLQQAAQQHIQQQLLAKQVKAFPASAALAPATPAL

QPIHIQQPATASATSITTVQHAILQHHAAAAAAAIGIHPHPHPQ 

PLAQVHHIPQPHLTPISLSHLTHSIIPGHPATPLASHPIHIIPA 

SAIHPGPFTFHPVPHAALYPTLLAPRPAAAAATALHLHPLLHPI 

FSGQDLQHPPSHGT 


5971 


53 


2149 


SFLYFVGVDMDNPIGNWDGRFDGVQLCSFACVESTILLHINDII

P E S VTQE RR P PKLA FMS RG VGDKGS S S HNK PKATG S TS DPGNRN 

RSELFYTLNGSSVDSQPQSKSKNTWYIDEVAEDPAKSLTEISTD 

FDRSSPPLQPPPVNSLTTENRFHSLPFSLTKMPNTNGS IGHSPL 

SLSAQSVMEELNTAPVQESPPIiAMPPGNSHGLEVGSLAEVKENP 

PFYGVIRWIGQPPGLNEVLAGLELEDECAG\CTDGTF/REGTRY 

FTCALKKALFVKLKSCRPDSRFASLQPVSNQIERCNSLAIWEAY 

LSEWEENTPTQKWEKEGLEIMIG\KKKGIQGHYNSCYLDSTLF 

CLFAFSSVLDTVLLRPKEKNDVEYYSETQELLRTEIVNPLRIYG 

YVCATKIMKLRKILEKVEAASGFTSEEKDPEEFLN1LFHHILRV 

EPLLKIRSAGQKVQDCYFYQIFMEKNEKVGVPTIQQLLEWSFIN 

SNLKFAEAPSCLIIQMPRFGKDFKLFKKIFPSLELNITDLLEDT 

PRQCRICGGLAMYECRECYDDPDISAGKIKQFCKTCNTQVHLHP 

KRLNHKYNPVSLPKDLPDWDWRHGCIPCQNMELFAVLCIETSHY 

VAFVKYGKDDSAWLFFDSMADRDGGQNGFNIPQVTPCPEVGEYL 

KMSLEDLHSLDSRRIQGCARRLLCDA1YVPCTQSPTMSLYK 


5972 


440 


1761 


ILLAGSPSPRDQCSQRQSSGGDKELVTRGCTFSTAWSPSAMTQ

EPFREELAYDRMPTLERGRQDPASYAPDAKPSDLQLSKRLPPCF 

SHKTWVFSVLMGSCLIiVTSGFSLYLGNVFPAEMDYLRCAAGSCI 

PSAIVSFTVSRRNANVIPNFQILFVSTFAVTTTCLIWFGCKLVL 

NPSAININFNLXLLLLLELLMAATVI1AARSSEEDCKKKKGSMS 

DSAN I LDEVP FPARVLXS YS WE VI AG I SAVLGG 1 1 ALNVDDS V 

SGPHLSVTFFWILVACFPSAIASHVAAECPNKCLVEVLIAISSL 

TSPLLFTASGYLSFSIMRIVEMFKDYPPAIKPSYDVLLLLLLLV 

LLLQA/GPQHGHRHPVRALQGQCKAAGCILGHPERPAGAPGWGG 

GQE P P EGVRQG E S LES RRG ANG P VT PRRGNR VAAP S LAPGMETH 

NP 


5973 


65 


2007


NGDGKDLFGH I WAWRSNG 1 1 SNFRRS PHAGMAEDE PDAKS PKTG 
GRAP PGGAEAGEPTTLLQRLRGTI S KAVQNKVEG I LQD VQKFS D 
NDKLYLYLQLPSGPTTGDKSSEPSTLSNEEYMYAYRWIRNHLEE 
HTDTCLPKQSVYDAYRKYCESLACCRPLSTANFGKI IREI FPDI 
KARRLGGRGQSKYCYSGIRRKTLVSMPPLPGLDLKGSESPEMGP 
EVTPAPRDELVEAACALTCDWAERILKRSFSSIVEVARFLLQQH 
LISARSAHAHVLKAMGLAEEDEHAPRERSSKPKNGLENPEGGAH 
KKPERLAQPPKDLEARTGAGPLARGBRKKSWESSAPGANNLQV 
NALVARLPLLLPRAPRSLI PPI PVSPPILAPRLSSGALKVATLP 
LSSRAGAPPAAVPIINMILPTVPALPGPGPGPGRAPPGGLTQPR 
GTENRE VG I GGDQGPHDKG VKRTAEVP VS EASGQAP PAKAAKQD 
IEDTASDAKRKRGRPLKKSGGSGERNSTPLKSAAAMESAQSSRL 
PWETWGSGGEGNSAGGAERPGPMGEABKGAVLAQG\QGDGTVSK 
GGRGPGS QHTKEAEDK I PL VPS KVS VI KGSRSQKEAFPLAKGEV 
DTAPQGNKDLKEHVLQSSLSQEHKDPKATPP 


5974 


4293 


2200 


LGLQMHTTSGRIHQAMVTSLNEDNESVTVEWIENGDTKGK\EID 
LESIFSLNP\DL\VPDGEIEPSP\ETPPPPASSAKVNKIVKNRR 
TV\ASIKNDPPS\RDNRWGSARARPSQFPEQFSSAQQNGSV\S 
DISPVQAAKKEFGPPSRRKSNCVKEVEKLQEKREKRRLQQQELR 
EKRAQDVDATNPNYEIMCMIRDFRGSLDYRPLTTADPIDEHRIC
VCVRKRPLNKKETQMKDLDVITIPSKDWMVHEPKQKVDLTRYL
ENQTFRFDYAFDDSAPNEMVYRFTARPLVETIFERGMATCFAYG 
QTGSGKTHTMGGDFSGKNQDCSKGIYALAARDVFLMLKKPNYKK
LELQVYATFFEIYSGKVFDLLNRKTKLRVLEDGKQQVQWGLQE
REVKCVEDVLKLIDIGNSCRTSGQTSANAHSSRSHAVFQIILRR
KGKLHGKFSLIDLAGNERGADTSSADRQTRLEGAEINKSLLALK
ECIRALGRNKPHTPFRASKLTQVLRDSFIGENSRTCMIATISPG
MASCENTLNTLRYANRVKELTVDPTAAGDVRPIMHHPPNQI\DD
LETQWGVGSSPQRDDLKLLCEQNEEEVSPQLFTFHEAVSQMVEM








EEQWEDHRAVFQESIRWLEDEKALLEMTEEVDYDVDSYATQLE 
AILEQKIDILTELRDKVKSFRAALQEEEQASKQINPKRPRAL 


5975 


4293 


2200 


LGLQMHTTSGRIHQAMVTSLNEDNESVTVEWIENGDTKGK\EID
LESIFSLNP\DL\VPDGEIEPSP\ETPPPPASSAKVNKIVKNRR 
TV\ASIKNDPPS\RDNRWGSARARPSQFPEQFSSAQQNGSV\S 
DISPVQAAKKEFGPPSRRKSNCVKEVEKLQEKREKRRLQQQELR 
EKRAQDVDATNPNYEIMCMIRDFRGSLDYRPLTTADPIDEHRIC
VCVRKRPLNKKETQMKDLDVITIPSKDWMVHEPKQKVDLTRYL
ENQTFRFDYAFDDSAPNEMVYRFTARPLVETIFERGMATCFAYG
QTGSGKTHTMGGDFSGKNQDCSKGIYALAARDVFLMLKKPNYKK 
LELQVYATFFEIYSGKVFDLLNRKTKLRVLEDGKQQVQWGLQE 
REVKCVEDVLKLIDIGNSCRTSGQTSANAHSSRSHAVFQIILRR 
KGKLHGKFSLIDLAGNERGADTSSADRQTRLEGAEINKSLLALK 
ECIRALGRNKPHTPFRASKLTQVLRDSFIGENSRTCMIATISPG 
MASCENTLNTLRYANRVKELTVDPTAAGDVRPIMHHPPNQI\DD
LETQWGVGSSPQRDDLKLLCEQNEEEVSPQLFTFHEAVSQMVEM 
EEQWEDHRAVFQESIRWLEDEKALLEMTEEVDYDVDSYATQLE 
AILEQKIDILTELRDKVKSFRAALQEEEQASKQINPKRPRAL


5976 


20 


2949 


VHHLHLTRVSWVNLDIILRIAQQMG1KTLNLVLG\LKRA\LEF

PEVSWMEVKD PNMKGAMLTNTG K YA I PT I D A\ E A YA I G KKE KP P 
FLPEEPSSSSEEDDPIPDELLCLICKDIMTDAWIPCCGNSYCD 
ECIRTALLESDEHTCPTCHQNDVSPDALIANKFLRQAVNNFKNE 
TGYTKRLRKQLPSPPPPIPPPRPLIQRNLQPLMRSPISRQQDPL 
MIPVTSSSTHPAPSISSLTSNQSSLAPPVSGNPSSAPAPVPDIT 
ATVSISVHSEKSDGPFRDSDNKILPAAALASEHSKGTSSIAITA 
LMEEKGYQVPVLGTPSLLGQSLLHGQLIPTTGPVRINTARPGGG 
R PG W E H SN KLG Y L VS P P QQ I RRGE RS CY RS I NRGRHH S E RS QRT 
QGPSLPATPVFVPVPPPPLYPPPPHTLPLPPGVPPPQFSPQFPP 
GQP\PPAGYSVPPPGFPPAPANLSTPWVSSGVQTAHSNTIPTTQ 
APPLSREEFYREQRRLKEEEKKKSKLDEFTNDFAKELMEYKKIQ 
KERRRSFSRSKSPYSGSSYSRSSYTYSKSRSGSTRSRSYSRSFS 

RSHSRSYSRSPPYPRRflRRV^RNVTJQDCDCurYUDODOnonnvn 

RYHSRSRSPQAFRGQSPNKRNVPQGETEREYFNRYREVPPPYDM 
KAYYGRSVDFRDPFEKERYREWERKYREWYBKYYKGYAAGAQPR 
PSANRENFSPERFLPLNIRNSPFTRGRREDYVGGQSHRSRNIGS 
N YP EKLS ARDGHNQKDNTKS KEKES EWAPGDG KGNKHKKHR KRR 
KGEESEGFLNPELLBTSRKSREPTGVEENKTDSLFVLPSRDDAT 
PVRDEPMDAESITFKSVSEKDKRERDKPKAKGDKTKRKNDGSAV 
SKKENIVKPAKGPQEKVDG\DVRDLLDLNL\QLKKPKEETPKDL 
TILNHHLPLRRMKKSL\EPP\EKLTLNOOK\TPRWTCTQnpr!ifCT? 
EGLFQRCQIRKANN 


5977 


1363 


1336 


FLEDRGQVLSHFQCLSLHS I NH I Uk P GAG VAAG PATGW /REYLT 
PVLKESKFKETGVITPEEFVAAGDHLVHHCPTWQWATGEELKVK 
AYLPTGKQFLVTKNVPCYKRCKQMEYSDELEAIIEEDDGDGGWV 
DTYHNTG I TGI TEA VKE 1 TLENKDNI RLQDCSALCEEEEDEDEG 
E AADME E YEE S GLLETDE ATLDTRKI VEACKAKTD AGG EDAI LQ 
TRTYDLYITYDKYYQTPRLWLFGYDEQRQPLTVEHMYEDISQDH 
VKKTVT I ENHPHLPP PPMCS VHPCRHAEVMKKI I ETVAEGGGEL 
GVHMYLL I FLKFVQAVI PT I E YD YTRHFTM 


5978 


160


3213 


RDGARRWGGCQSPLTWAPGFYRRFDLATSGRRLRGQTAEPAGRQ
RPRREPEAMDEQSVESIAEVFRCFICMEKLRDARLCPHCSKLCC 
FS CIRRWLTEQRAQCPHCRAPLQIiRELVNCRWAE E VTQQLDTLQ 
LCSLTKHEENEKDKCENHHEKLSVFCWTCKKCICHQCALWGGMK 
GGHTFKPLAE I YE QHVTKVNE E VAKLRRRLMEL I SLVQE VERNV 
EAVRNAKDERVRE I RNAVEMMIARLDTQLKNKLITLMGQKTSLT 
QETELLESLLQEVEHQLRSCSKSELISKSSEILMMFQQVHRKPM 
AS F VTT P VP PDFT S EL VPS YD S ATFVLENF S TLRQRAD P VYS P P 
LQ VS GL C WR LKV Y PDGNG WRG Y YLS VFLELS AGL PETS K YE YR 
VEMVHQSCNDPTKNIIREFASDFEVGECWGYNRFFRLDLLANEG 
YLNPQNDTVILRFQVRSPTFFQKSRDQHWYITQLEAAQTSYIQQ 
INNLKERLT I ELSRTQKSRDLS P PDNHLS PQNDDALETRAKKSA 








CSDMLLER\GPYSAS\VREAKEDEEDEEKIQNEDYHHELSDGDIi 
DLDLVYEDEVNQLDGSSS SAS STATSNTEENDI DEETMSGENDV 

ATSSLLDIDPLILIHLLDLKDRSSIENLWGLQPRPPASIjLQPTA 
SYSRKDKDQRKQQAMWRVPSDLKMLKRLKTQMAEVRCMKTDVKN 
TLSEIKSSSAASGDMQTSLFSADQAALAACGTENSGRLQDLGME 
LLAKSSVANCYIRNSTNKKSNSPKPARSSVAGSLSLRRAVDPGE 
NSRSKGDCQTLSEGSPGSSQSGSRHSSPRALIHGSIGDILPKTE 
DRQCKALDSDAVWAVFSGLPAVEKRRKMVTLGANAKGGHLEGL 
QMTDLENNSETGELQPVLPEGASAAPEEGMSSDSDIECDTENEE 
QEEHTSVGGFHDSFMVMTQPPDEDTHSSFPDGEQIGPEDJjSFNT 
DENSGR 


5979 


212 


3665 


LPDMTMYLWLKLLAFGFAFLDTEVFVTGQSPTPSPTDAYLNASE 
TTTLSPSGSAVISTTTIATTPSKPTCDEKYANITVDYLYNKETK 
LFTAKLNVNENVECGNNTCTNNEVHNLTECKNASVSISHNSCTA 
PDKTLILDVPPGVEKVPVHCCS\QVEQPDSTIWLKWKNIETSTC 
DTQN I T YR FQ CGNM I FDN KE I KL ENLE P EHE Y KCD S E I L YNS HK 
FTNAS KIIKTD FGSPGE PQI I FCRS EAAHQG VI TWNPPQRS FHN 
FTLC Y I KETE KDCLNLDKNL I K YD LQNLKP YT K YVLS LHAY I I A 
KVQRNGSAAMCHFTTKSAPPSQVWNMTVSMTSDNSMHVKCRPPR 
DRNGPHERYHLEVEAGNTLVRNESHKNCDFRVKDLQYSTDYTFK 
AYFHNGDYPGEPFILHHSTSYNSKALIAFLAFLIIVTSIALLVV 
LYKIYDIjHKKRSCNLDEQQELVERDDEKQLMNVEPIH7\DILLET 
YKRKI ADEGRLFLAEFQS 1 PRVFS KFP I KEARKPFNQNKNRYVD 
ILPYDYNRVELSEINGDAGSNYINASYIDGFKEPRKYIAAQGPR 
DETVDDFWRMIWEQKATVIVMVTRCEEGNRNKCAEYWPSMEEGT 
RAFGECCCKDLTKHKRCP\DYIIQKLNIVNKKEKATGREVTHIQ 
FTSWPDHGVPEDPHLLLKLRRRVNAFSNFFSGPIWHCSAGVGR 
TGTYIG I DAMLEGLEAENKVDVYG YWKLRRQRCLM VQVEAQY I 
jj±rtU>uj v k. i in y i? J. h, VNIjS EbHP YLHNMKKRDPP SEPSPLEAE 
FQRLPSYRSWRTQHIGNQE\ENKSKNRNSNVIPYDYNRVPLKHE 
LEMSKESEHDSDESSDDDSDSEEPSKYINASFIMSYWKP\EVMI 
AAQGPLKETIGDFWQMIFQRKVKVIVMLTELKHGDQEICAQYWG 
EGKQTYGDIEVDLKDTDKSSTYTLRVFELRHSKRKDSRTVYQYQ 
YTNWSVEQLPAEPKELISMIQVVKQKLPQKNSSEGNKHHKSTPL 
LIHCRDGSQQTGIFCALLNLLESAETEEWDIFQWKALRKARP 
GMVSTFEQYQFLYDVIASTYPAQNGQVKKNNHQEDKIEFDNEVD 
KVKQDANC VNP LGAP EKL P E AKEQ AEGS E PTSGTEG P EHS VNG P 
ASPALNQGS 


5980 


3 


2363 


DAWGCKLRRLRFTYGTQTRVSLALPGQYEIjVHTIjVAHQGNWETI 
PE EDLE VQ ENNE D AAHDLTEL E VTMHHALLQ E VD VWA? CQGLR 
PTVDVLGDLVNDFLPVITYALHKDELSERDEQELQEIRKYFSFP 
VFFFKVPKLGSEIIDSSTRRMESERSPLYRQLIDLGYLSSSHWN 
CGAPGQDTKAQSMLVEQSEKLRHLSTFSHQVLQTRLVDAAKALN 
LVHCHCLDIFINQAFDMQRDLQITPKRLEYTRKKENELYESLMN 
IANRKQEEMKDMIVETLNTMKEELLDDATNMEFKDVIVPENGEP 
VGTREIKCCIRQIQEIiIISRLNQAVANKLISSVDYLRESFVGTL 
ERCLQS LEKSQD VS VHI TSN YLKQ I LNAAYHVE VTFHSGS S VTR 
MLWEQIKQIIQRITWVSPPAITLEWKRKVAQEAIESLSASKLAK 
SICSQFRTRLNSSHEAFAASLRQLEAGHSGRLEKTEDLWLRVRK 
DHAPRLARLS LE S RS LQD VLLHR KP KLGQ ELG RGQ YG WYLCDN 
WGGHFPCALKSWPPDEKHWNDLAI.FFHYMRQT.PirRTypT imr up 
SVIDYNYGGGSSIAVLLIMERLHRDLYTGLKAGLTLETRLQIAL 
D WEG I R FLHS QG L VHRD I KL KNVL LDKQNRAKI TDLG FCKPE A 
MMSGSIVGTPIHMAPELFTGKYDNSVDVYAFGILFWYICSGSVK 
L P E AFE RCAS KDHL WNNVRRGAR PE RLP VFDEE CWQLME AC WDG 
DPLKRPLLGIVQPMLQGIMNRLCKS\NSEQPNRGLDDST 


5981 


1 


2519 


GRRHSAAMERPWGAADGLSRWPHGtGLLLLLQLLP^STLSQDRt 
DAPPPPAAPLPRWSGPIGVSWGLRAAAA\GGAFPRGGRWRRSAP 
G\EDEECGRVRDFVAICLANNTHQHVFDDLRGSVSLSWVGDSTGV 
ILVLTTFHVPLVIMTFGQSKLYRSEDYGKNFKDITDLINNTFIR 








TEFGMAIGPENSGKWLTAEVSGGSRGGRIFRSSDFAKNFVQTD

L P FH P LTQ MM YS PQNS D YLLAL S TENG LWVS KN FGG KWE E I H KA 

VCLAKWGSDNTIFFTTYANGSCKADLGALELWRTSDLGKSFKTI 

GVKIYSFGLGGRFLFASVMADKDTTRRIHVSTDQGDTWSMAQLP 

SVGQEQFYSILAANDDMVFMHVDEPGDTGFGTIFTSDDRGIVYS 

KS LDRH L Y TTTGGE TD FTNVTS LRGV Y I TS VLS E DNS I QTM I TF 

DQGGR WTHLRKPENS ECDATAKNKNFO <5 T ■ w T HA <3 Y Q t qawt mwd 

MAPLSEPNAVGIVIAHGSVGDAISVMVPDVYISDDGGYSWTKML 

EGPHYYTILDSGGIIVAIEHSSRPINVIKFSTDEGQCWQTYTFT 

RDPIYFTGLASEPGARSMNISIWGFTESFLTSQWVSYTIDFKDI 

LERNCEEKDYTIWLAHSTDPEDYEDGCILGYKEQFLRLRKSSVC 

QNGRDYWTKQPSICLCSLEDFLCDFGYYRPENDSKCVEQPELK 

GHDLEFCLYGREEHLTTNGYRKIPGDKCQGGVNPVREVKDLKKK 

CTSNFLSPEKQNSKSNSVPIILAIVGLMLVTVVAGVLIVKKYVC 

GGRFLVHLYSVLQQH\AEA\NGVDGVDALDTASHTNKSGYHDDS 
DEDLLE 


5982 




2316 


ATRPPRGSSWCRQFSRTASAAPGRSNMLRIPVRKALVGLSKSPK

GCVRTTATAASNLIEVFVDGQSVMVEPGTTVLQACEKVGMQIPR 

FCYHERLSVAGNCRMCLVEIEKAPKWAACAMPVMKGVJNILTNS 

EKSKKAREGVMEFLTiANHPLDCPICDQGGECDLQDQSMMFGNDR 

SRFLEGKRAVEDKN1GPLVKTIMTRCIQCTRCIRFASEIAGVDD 

LGTTGRGNDMQVGTYIEKMFMSELSGNIIDICPVGALTSKPYAF 

TARPWETRKTESIDVMDAVGSNIWSTRTGEVMRILPRMHEDIN 

EEWISDKTRFAYDGLKRQRLTEPMVRNEKGLLTYTSWEDALSRV 

r>.\ji. i±j\4&L \£\j£\uvnnxn\3y3UVUi\£it\jj V/UjIUJLIjNK VD5DTLCTEE 

VFPTAGAGTDbRSNYLLNTTIAGVEEADWLLVGTNPRFEAPLF 

NARIRKSWLHNDLKVALIGSPVDLTYTYDHLGDSPKILQDIASG 

SHPFSQVLKEAKKPMWLGSSALQRNDGAAILAAVSSIAQKIRM 

TSGVTGDW KVMNI LHR IASQVAALDLG YKPGVEA IRKNP PKVLF 

LLGADGGCITRQDLPKDCFIIYQGHHGDVGAPIADVILPGAAYT 

E KS AT YVNTEGRAQQTKVAVTP PG LAR EDW K 1 1 RALS E I AGMTL 

P YDTL \ DQVRNR LEE VS PNLVRYDDI EG\ANYFQQANELS KLVN 

QQLLADPLVPPQLTMKDFYMTDSISRASQTMAKCVKAVTEGAQA 
VEEPSIC 


5983 


248 


1763 


E ARGDGG RRRHRAS GRRAGRG E P \ AG L KSQG QRA V~P KRAVARGrGr 
RQ\YSAAIALLEPAGSEIADDLSILYSNRAACYLKEGNCSGCIQ 

j-»v_i.^rv<-\jjiii_inf c oiMl\JrJjJjKlvHJ*lAi Ji Jl Lc»Q YCaKAY VDYKTVLQIDC 

GLQLANDSVNRLSRILMELDGPNWREKLSLIPAVPASVPLQAWH 
PAKEMISKQAGDSSSHRQQGITDEKTFKALKEEGNQCVNDKNYK 
DAL S K YS ECL K I NNKECA I YTNRALC YL KLCQFE E AKQD CDQ AL 
OLADGNVKAFYRRAl.AHKGTiK'MYnVQT.TnT kjv\ttj t ddcttdav 

MELEEVTRLLNLKDKTAPFNKEKERRKIEIQEVNEGKEEPGRPA 
GEVSTGCLASEKGGKSSRSPEDPEKLPIAKPNNAYEFGQIINAL 
S TRKDKE ACAHL LA I TAPKDL PMFLSNKLEGDTFLLL I QS LKNN 
L IEKD PS LVYQHLLYLSKAER FKMMLTL I S KGQKELI EQLFEDL 
SDTPNNHFTLEDIQALKRQYEL 


5984 


755 


1193 


SSVCMACTYVSNLGKKQRSVSFLASGLMRVSTGPELRLHHSFVL
TGDVGRR I CRLLVGLFTKGDTSS KRVHPFS PGPCFLLCDLAR VG 
SSPKINVSPFYQN\QTSTQRSCTVFVWQRCSLVGPFQVTVFTMY 
FHHSLRS I SRFSSG 


5985 


22 


1408


RRVARPGTAEPAKARRTVRRGRARRDLAGAERKAGVSERGDSGR 
RRPNPSIPSAAAGMSHIQIPPGLTELLQGYTVEVLRQQPPDLVE 
FAVEYFTRLREARAPASVLPAATPRQSLGHPPPEPGPDRVTVDAX 
GDSESEEDEDLEVPVPSRFNRRVSVCAETYNPDEEEEDTDPRVI 
HPKTDEQRCRLQEACKD ILLFKNLDQEQLSQ VLDAM FER I VKAD 
EHVIDQGDDGDNFYVIERGTYDILVTKDNQTRSVGQYDNRGSFG 
ELALMYNTPRAATIVATSEGSLWGLDRVTFRRIIVKNNAKKRKM 
FESFIESVPLLKSLEVSERMKIVDVIGEKIYXR/DGERIITQGE 
K\ ADSFYI I ESGEVS I LI RSRTKSNKDGGNQE VEXARCHKGQ YF 
G E LALVTN K P RAAS A YAVGDVKCLVMD VQAFE RLLG P CMDI M KR 
N I SH YEEQL VKM FGS S VDLGNLGQ 






WO 01/53312 PCT/US00/34263 





5986


1806 


484 


DAWKSTSLTPHWKLWGRHRGRRRGLAHPKNHLSPQQGGATPQVP
SPCCRFDSPRGPPPPRLGLLGALMAEDGVRGSPPVPSGPPMEED
GLRWTPKSPLDPDSGLLSCTLPNGFGGQSGPEGERSLAPPDASI
LISNVCSIGDHVAQELFQGSDLGMAEEAERPGEK\AGQHSPLRE
EHVTCVQSILDEFLQT\YGSLIPLSTDEWEKLEDIFQQEFSTP
SRKGLVLQLIQSYQRMPGNAMVRGFRVAYKRHVLTMDDLGTLYG
QNWLNDQVMNMYGDLVMDTVPEK\VHFFNSFFY\DKLRTKGYDG
VKRWTKNVDIFNKELLLIPIHLEVHWSLISVDVRKRTITYFDSQ
RTLNRRCPKHIAKYLQAEAVKKDRLDFHQGWKGYFKMNVARQNN

DSDCGAFVLQYCKHLALSQPFSFTQQDMPKLRRQIYKELCHCKL 
TV 


5987 


1806 


484 


DAWKSTSLTFHWKLWGRHRGRRRGLAHPKNHLSPQQGGATPQVP 
SPCCRFDSPRGPPPPRLGLLGALMAEDGVRGSPPVPSGPPMEED
GLRWTPKSPLDPDSGLLSCTLPNGFGGQSGPEGERSLAPPDASI 
LISNVCSIGDHVAQELFQGSDLGMAEEAERPGEK\AGQHSPLRE 
EHVTCVQSILDEFLQT\YGSLIPLSTDEWEKLEDIFQQEFSTP
SRKGLVLQLIQSYQRMPGNAMVRGFRVAYKRHVLTMDDLGTLYG 
QNWLNDQVMNMYGDLVMDTVPEK\VHFFNSFFY\DKLRTKGYDG 
VKRWTKNVDIFNKELLLIPIHLEVHWSLISVDVRRRTITYFDSQ
RTLNRRCPKHIAKYLQAEAVKKDRLDFHQGWKGYFKMNVARQNN

DSDCGAFVLQYCKHLALSQPFSFTQQDMPKLRRQIYKELCHCKL 
TV 


5988 


1292 


410 


FKKYFLS FLGLLE S SHS RDR I HNLVLM FLLATHNLVWW FTCRFQ 
RLDCIYLNAGIMPNPQLNIKALLFGLFS\AEGLLTQGDKITADG 
LQEVFETDVFGHFILIRELEPLLCHSDNPSQLIWTSSRNARKSN 
FSLEDFQHSKGKEPYSSSKYATDLLSVALNRNFNQQGLYSNVAC 
PGTALTNLTYGILPPFIWTLLMPAILLLRFFANAFTLTPYNGTE 
ALVWLFHQKPESLNPLIKYLSATTGFGRNYIMTQKMDLDEDTAE 
KFYQKLLELEKHIRVTIQKTDNQARLSGSCL 


5989


194 


2610 


AMDFPQHSQHVLEQLNQQRQLGLLCDCTFVVDGVHFKAHKAVLA 
ACSEYFKMLFVDQKDWHLDISNAAGLGQVLBFMYTAKLSLSPE 
NVDDVL \ AVATFLQMQDI I TACHALKS LAE PATS PGGNAEALAT 
EGGDKRAKEEKVATSTLSRLEQAGRSTPIGPSRDLKEERGGQAQ 
SAASGAEQTEKADAPREPPPVELKPDPTSGMAAAEAEAALSESS 
EQEMEVEPARKGEEEQKEQEEQEEEGAGPAEVKEEGSQLENGEA 
PEENENEESAGTDSGQELGSEARGLRSGTYGDRTESKAYGSVIH 
KCEDCG KE FTHTGNFKRH I R IHTGEKP FS CRECS KAFS DPAACK 
AHEKTHS PLKPYGCEECGKS YRLI SLLNLRKKRHSGEARYRCED 
CGKLFTTSGNLKRHQLVHSGEKPYQCDYCGRSFSDPTSKMRHLE 
THDTDKEHKCPHCDKKFNQVGNLKAHLKIHIADGPLKCRECGKQ 
FTTS GN LKRHLRIHSGE KP YVC I HCQRQ FADPGALQRHVR I HTG 

EKPCQCVMCGKAFTQASSLIAHVRQHTGEKPYVCERCGKRFVQS 
S QLANH I RHHDN I R PHKCSVCS KAFVNVGDLS KH 1 1 IHTGE KP Y 
L C DKCGRG FNR VDNLRSHVKT VHQ GKAG IKILEPEEGS E VS WT 
VDDMVTLATEALAATAVTQLTVVP VGAAVTADETE VLKAE I SKA 
VKQVQEEDPNTH I LYACDSCGDKFLDANSLAQHVR IHTAQALVM 
FQTDADFYQQYGPGGTWPAGQVLQAGELVFRPRDGAEGQPALAE 
TSPTAPECPPPAE 


5990 


2 


4700 


FGPGPDSGGGARGSGWGSRSQAPYGTLGAVSGGEQVLLHEEAGD 
SG F VSLSRLG PS LRDKDLEMEELMLQDETLLGTMQS YMDASL I S 
LIEDFGSLGEVEMSLPDPSWDFSPPSFLETSSPKLPSWRPPRSR 
PRWGQS PPPQQRSDGEEEEEVAS FSGQILAGELDNCVSS I PDFP 
ririLtf\^i? CjCiZjUJ\j\ii\f\E l FLftV rAftGDES ISSLSELVRAMHPYCLPN 
LTHLAS LEDELQEQPDDLTLPEGCWLE I VGQAATAGDDLE I PV 
WRQVSPGPRPVLLDDSLETSSALQLIiMPTLESETEAAVPKVTL 
CSEKEGLSLNSEEKLDSACLLKPREWEPWPKEPQNPPANAAP 
GSQRARKGRKKKSKEQPAACVEGYARRLRSSSRGQSTVGTEVTS 
QVDNLQKQPQEELQKESGPLQGKGKPRAWARAWAAALENSSPKN 
LERS AGQ S S P AKEGPLDL YP KLADT I QTN P I P THL S L VDS AQAS 
PMPVDSVEADPTAVGPVLAGPVPVDPGLVDLASTSSELVEPLPA 
EPVLINPVLADSAAVDPAWPISDNLPPVDAVPSGPAPVDLALV
SEQ
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion) 








DPVPNDLTPVDPVLVKSRPTDPRRGAVSSALGGSAPQLLVESES 
LDPPKTIIPEVKEWDSLKIESGTSATTHEARPRPLSLSEYRRR 
RQQRQAETE ERS PQP PTGKWPSL PETPTGLAD I PCLVI PPAPAK 
KTALQRSPETPLEICLVPVGPSPASPSPEPPVSKPVASSPTEQV 
PSQEMPLLARPSPPVQSVSPAVPTPPSMSAALPFPAGGLGMPPS 
LPPPPLQPPSLPLSMGPVLPDPFTHYAPLPSWPCYPHVSPSGYP 
CLPPPPTVPLVSGTPGAYAVPPTCSVPWAPPPAPVSPYSSTCTY 
GPLGWGPGPQHAPFWSTVPPPPLPPASIGRAVPQPKMESRGTPA 
GPPENVLPLSMAPPLSLGIiPGHGAPQTEPTKVEVKPVPASPHPK 
HKVS ALVQS PQM KALACVSAEGVT VEEPASERLKPETQETRPRE 
KPPLPATKAVPTPRQSTVPKLPAVHPARLRKLSFLPTPRTQGSE 
DWQAFISEIGIEASDLSSLLEQFEKSEAKKECPPPAPADSLAV 
GNSGGVDIPQEKRPLDRLQAPELANVAGLTPPATPPHQLWKPLA 
AVSLLAKAKSPKSTAQEGTLKPEGVTEAKHPAAVRLQEGVHGPS 
RVHVGSGDHDYC\VRSRTPPKK\MPALLIPEVGSRWNVKRHQDI 
TIKPVLSLGPAAPPPPCIAASREPLDHRTSSEQADPSAPCLAPS 
SLLSPEASPCRNDMNTRTPPEPSAKQRSMRCYRKACRSASPSSQ 
GWQGRRGRNSRSVSSGSNRTSEASSSSSSSSSSSRSRSRSLSPP 
HKRWRRSSCSSSGRSRRCSSSSSSSSSSSSSSSSSSSSRSRSRS 
PS PRRRSDRRRR YSS YRSHDHYQRQRVLQKERAI EERR WFIGK 
IPGRMTRSELKQRFSVFGEIEECTIHFRVQGDNYGFVTYRYAEE 
AFAAIESGHKLRQADEQPFDLCFGGRRQFCKRSYSDLDSNREDF 
DPAPVKSKFDSLDFDTLLKQAQKNLRR 


5991 


334 


1379 


RLSSHFSQCSPSIYCXTKFDKQGNVTSFERKKTELYQELGLQAR 
DLRFQHVMSITVRNNRIIMRMEYLKAVITPECLLILDYRNLNLK 
QWLFRELPSQLSGEGQLVTYPLPFEFRAIEALLQYWINTLQGKL 
SILQPLILETLDALGDPKHSSVDRSKLHILLQNGKSLSELETDI 
KIFKESILE I LDEEELLEELCVS KWSDPQVFEKSSAG I DHAEEM
ELLLENYYRLADDLSNAARELRVL IDDSQS I I F INLDSHRNVMM 
RLNLQLTMGTFSLSLFGLMGVAFGMNLESSLEEDHRIFWLITGI 
MFMGSGLIWRRLLSFLGR/LARSSIASYGMKDMVHGGIVEGL 


5992 


2 


609 


AGPDFRLVCGVSGSGFPGGRQGQATEWRPLRPWNGAMEKLRRVL 
SGQDDEEQGLTAQDSQINL/SEVLDASSLSFNTRLKWFAICFVC 
GVFFS ILGTGLLWLPGGI KLFAVFYTLGNLAALASTCFLMGPVK 
QLKKMFEATRLLATI VMLLCFI FTLCAALWWHKKGLAVLFCI LQ 
FLSMTWYSLS YI PYARDAVI KCCSSLLS 


5993 


1650 


594 


AEGLGS WAVWAGLGWAGRHMEAGGATGALGVGCKliPSAFC F PGS 
SVAMDMFQKVEKIGEGTYGWYKAKNRETGQLVALKKIRLDLEM 
EG VPS TA IREISLLKELKHPN I VRLLD WHNER KL YLVFE FLSQ 
DLKKYMDSTPGSELPLHLIKSYTjFQLLQGVSFCHSHRVIHRDLK 
PQNLLI^LGAIKLADFGLARAFGVPLRTYTHEVVTIiWYRAPEI 
LLATRFYTTAVDIWSIGCIFAEMVTRKALFPGDS\EIDQ\LFRI 
FRMLGTPSEDTWPGVTQLPDYKGSFPKWTRKGLEEIVPNLEPEG 
RD LLMQLLQ YDPS Q R I TAKTALAHP Y FS S PE P S PAARQ YVLQRF 
RH 


5994 


394 


1934 


AGEVQLHVWIRGMRIQPQ/KAAAIIDLDPDFEPQSRPRSCTWPL 
PRPEIANQPSKPPEVEPDLGEKVHTEGRSEPILLPSRLPEPAGG 
PQPG I LGAVTG PRKGGS RRNAWGNQS YAEL I S QAI ES APE KRLT 
LAQIYEWMVRTVPYFKDKGDSNSSAGWKNSIRHNLSLHSKFIKV 
HNEATGKSSWWMLNPEGGKSGKAPRRRAASMDSSSKLLRGRSKA 
PKKKPSGLPAP PEGATPTS PVGHFAKWSGSPCSRNREEADMWTT 
FRPRSSSNASSVSTRLSPLRPESEVLAEEIPASVSSYAGGVPPT 
LNEGLELLDGLNLTSSHSLLSRSGLSGFSLQHPGVTGPLHTYSS 
SLFS PAEGPLSAGEGCFSSSQALEALLTS DTPP P PADVLMTQVD 
PILSQAPTLLLLGGLPSSSKLATGVGLCPKPLEAPGPSSLVPTL 
SMIAPPPVMASAPIPKALGTPVLTPPTEAASQDRMPQDLDLDMY 
MENLECDMDNIISDLMDEGEGLDFNFEPDP 


5995


2 


2437 


RPPGPGPASGAWLCTRARGSAAFVPPLPRPPSRGARRRRRLPGR 
GVAALRRGPGSAPGLPRGRAERSAAGSGRGPSREERGAAAAAAA 
AEMMEELHSL\DP\RRQEXiLEARF\TGLGVSKGPLNSESSNQSL 
CSVGSLSDKEVETPEKKQNDQRNRKRKAEPYETSQGKGTPRGHK
ISDYFERRVEQPLYGLDGSAAKEATEEQSALPTLMSVMLAKPRL 
DTEQLAQRGAGLCFTFVSAQQNSPSSTGSGNTEHSCSSQKQISI 
QHRQT\QSDLTIEKI SALENS KNSDLEKKEGRIDDLLRANCDLR 
RQ I \DEQQKMLEKYK\ ERLNRCFDNEPRNFLI EKS KQEKMACRD 
KS MQDRLRLGHFTTVRHGAS FTEQWTDG YAFQNL I KQQERINS Q 
REE I ERQRKMLAKRKPPAMGQAP PATNEQKQRKSKTNGAENETL 
TLAEYHEQEEIFKLRLGHLKKEEAEIQAELERLERVRNLHIREL 
KRIHNEDNSQFKDHPTLNDRYLLLHLLGRGGFSEVYKAFDLTEQ 
RYVAVKIHQLNKNWRDEKKENYHKHACREYRIHKELDHPRIVKL 
YDYFSLDTDSFCTVLEYCEGNDLDFYLKQHKLMSEKEARSIIMQ 
J. v iMAi; a. i LiiN b 1 fttf tr I In i UbKlrGIJI JjLVNGTACGEIKITDFGLS 
KIMDDDSYNSVDGMELTSQGAGTYWYLPPECFWGKEPPKISNK 
VDVWSVGVIFYQCLYGRKPFGHNQSQQDILQENTILKATEVQFP 
PKPWTPEAKAFIRRCLAYRKEDRIDVQQLACDPYLLPHIRKSV 
STSSPAGAAIASTSGASNNSSSN 




1^12 " 


981 


L>y yAi-ijijVjljrlJb I Lib b U 1 Lib c u Fb W 1 GS W I QR/ S W VSWRSRPGCE 
LFS I WFGS IVNEGYLNSASEGEEFCI YNRNPNACS YGVAVGVL 
AFLTCLLYLALDVYFPQISSVKDRKK\AVLSGHPWSGEPHPAA 
FWAFLWFTGDS C YL \ ANQWQVS KPKDNPLNEGTDAS PGR PSPFS 
FFS I FTWS LTAALAVRR FKDLS FQEE YSTLFP \ ASAQP 


5997 


1612 


981


*-» V v>iULfcLAj4jnij 1 Lib £ vjr j. Lib t jj Fi> W 1 t»i> W i y R / S W VSWRSRPGCE 
LFS I WFGS IVNEGYLNSASEGEEFCI YNRNPNACS YGVAVGVL 
A FLTCLL YLALD VY F PQ I S S VKDRKK\ AVLSGH P WS G E PHPAA 
FWAFLWFTGDSCYL\ ANQWQVS KPKDNPLNEGTDAS PGR PSPFS 
FFSIFTWSLTAALAVRRFKDLSFQEE YSTLFP \ASAQP 


5998 


1612 


981 


DQQACLLGLMLTLEFGILEFDPSWIGSWTQR/SWVSWRSRPGCE 
LFS I WFGS IVNEGYLNSASEGEEFCI YNRNPNACS YGVAVGVL 
AFLTCLLYLALDVYFPQISSVKDRKK\AVLSGHPWSGEPHPAA 
FWAFLWFTGDSCYL\ ANQWQVS KPKDNPLNEGTDAS PGR PSPFS 
FFSIFTWSLTAALAVRRFKDLSFQEEYSTLPP\ASAQP 


5999 


2 


1790 


RPPMEKARRGGDGVPRGPVLHIWVGFHHKKGCQVEFSYPPLIP 
GDGHDSHTLPEEWKYLPFLALPDGAHNYQEDTVFFHLPPRNGNG 
ATVFGISCYR\QIEAKALKVRQADITRETVQKSVCVLSKLPLYG 
LLQAKLQL I THA Y FEE KDFS Q I S I LKEL YEHMNS S LGGAS LEGS 
QVYLGLS PRDLVLHFRHKGL I LFKL I LLEKKVLFY I S PVNKLVG 
ALMTVLSLFPGMIEHGLSDCSQYRPRKSMSEDGGLQESNPCADD 
F VS AS TAD VSHTNLGT I R KVMAGNHGEDAAM KTEE PL FQ VED SS 
KGQEPNDTNQYLKPPSRPSPDSSESDWETLDPSVLEDPNLKERE 
QLGSDQTNLFPKDSVPSESLPITVQPQANTGQWLIPGLISGLE 
EDQYGMPLAIFTKGYLCLPYMALQQHHLLSDVTVRGFVAGATNI 
LFRQQKHLSDAI VEVEEALIQ I HDPELRKLLNPTTADLRFAD YL 
VRHVTENRDDVFLDGTG WEGGDEW I RAQFAVY IHALLAATLQLV 
LFRI VNVAKKI GNVMVTT\SRNWQTGK\AVGQSVGGAFS \SAK 
TA\MSSWLSTFTTSTSQSLTEPPDEKP 


6000 


101 


1561 


TEPCRTAENCTATMSENNKNSLESSLRQLKCHFTWNLMEGENSL 
DDFEDKVFYRTEFQNREFKATMCNLLAYLKHLKGQNEAALECLR 
KAbbLi x LiLJbrlALJyAb IRS LVTWGNYAWVY YHMGRLS DVQ I YVDK 
VKHVCEKFSSPYRIESPELDCEEGWTRLKCGGNQNERAKVCFEK 
ALEKKPKNPEFTSGLAIASYRLDNWPPSQNAIDPLRQAIRLNPD 
NQYLKVLLALKLHKMREEGEEEGEGEK\LVEEALEKAPG\VTDV 
LRSAA\ KFYRGKDE PDKAI ELiLKKAIjE Y I P\NMAYT,Hrrnnrrv 

RAKVFQVMNLRENGMYGKRKLLELIGHAVAHLKKADEANDNLFR 
VCSILASLHALADQYEDAEYYFQKEFSKELTPVAKQLLHLRYGN 
FQLYQMKCEDKAIHHFIEGVKINQKSREKEKMKDKLQKIAKMRL 
S KNGADS EALHVLAFLQELNEKMQQADEDSERGLESGS LI PSAS 
SWNGE 


6001 


176 


1038


AFAHSPSRGHRHTHIHTPRHTPRCTMABSHLQSSLITASQFFEI 
WLHFDADGSGYLEGKELQNLIQELQQARKKAGLELSPEMKTFVD 
QYGQRDDGKIGIVELAHVLPTEENFLLLFRCQQLKSCE\EFMKT 
WRKYDTDHSGFIETEELKNFLKDLLEKANKTVDDTKLAEYTDLM
LKLFDSNNDGKLELTEMARLLPVQENFLLKPQG I KMCGKEPNKA 
FELYDQDGNGYIDENELDALLKDLCEKNKQDLDINNITTYKKNI 
MALS DGG KL YRTDLAL I LCAGDN 


6002 


977 


81 


IiAPPGGGLHIPPRTPLSHSRPPPSHHAPHPSPLPLPPADLHPHS 
S MAQ RSDLL ELDCQLTRDR VWVS HDENL CRQSGLNRD VGS LDF 
EDLPLYKEKLEVYFSPGHFAHGSDRRMVRLEDLFQRFPRTPMSV 
EIKGKNEELIREQ/VLVRRYDRNEITIWASEKSSVMICKCKAANP 
EMPLSFTISRGFWVLLSYYLGLLPFIPIPEKFFFCFLPNIINRT 
Y F P FS CS CLNQ LLA WS KWL I M R KS L I RHLE E RGVQ WFWCLNE 
ESDFEAAFSVGATGVITDYPTALRHYLDNHGPAARTS 


6003 


140 


4098 


GKLRAFRGMRRLICKRICDYKSFDDEESVDGNRPSSAASAFKVP 
APKTSGNP ANSARKPG S AGG P KVGAGAS KEGGAGAVDEDDF I KA 
FTDVPSIQIYSSRELEETLNKIREILSDDKHDWDQRANALKKIR 
SLLVAGAAQYDCFFQHLRLLDGALKLSAKDLRSQWREACITVA 
K L S TVLGNKFDHG AE A I V P TL FNLV PNS AKVMATSG CAA IRF 1 1 
RHTHVPRLIPLITSNCTSKSVPVRRRSFEFLDLLLQEWQTHSLE 
RHAAVLVETIKKGIHDADAEARVEARKTYMGLRNHFPGEAETLY 
NSLEPSYQKSLQTYLKSSGSVASLPQSDRSSSSSQESLNRPFSS 
KWSTANPSTVAGRVSAGSSKASSLPGSLQRSRSDIDVNAAAGAK 
AHHAAGQSVRSGRLGAGALNAGSYASLEDTSDKLDGTASEDGRV 
RAKLSAPLAGMGNAKADSRGRSRTKMVSQSQPGSRSGSPGRVLT 
TTALSTVSSGVQRVLVNSASAQKRSKIPRSQGCSREASPSRLSV 
ARSSRIPRPSVSQGCSREASRESSRDTSPVRSFQPLASRHHSRS 
TGAL YAPEVYGASGPG YG I SQSSRLS SSVSAMRVLNTGS DVEEA 
VADALLLGDIRTKKKPARRRYESYGMHSDDDANSDASSACSERS 
YSSRNGSIPTYMRQT\EDV\AEVIjNRCASSNWSERKEGLLGLQN 
LLKNQRTLSRVEIiKRLCEIFTRMFADPHGKRVFSMFLETLVDFI 
QVHKDDLQDWLFVLLTQLLKKMGADLLGSVQAKVQKALDVTRES 
FPNDLQFNILMRFTVDQTQTPSLKVKVAILKYIETLAKQMDPGD 
FINSSETRLAVSRVITWTTEPKSSDVRKAAQSVLISLFELNTPE 
FTMLLGALPKTFQDGATKLLHNHLRNTGNGTQSSMGSPLTRPTP 
RSPANWSSPLTS PTWTSONTTi c ?P^APnvryppwMMCi?nTVQCT ot~> 

VTEAIQNFSFRSQEDMNEPLKRDSKKDDGDSMCGGPG\MSDPRA 
GGDATDSSQTAL\DNKASLLHSMPTHSSPRSRDYNPYNYSDSIS 
PFlWSALKEAMFDDDADQFPDDLSLDHSDLVAELliKELSNHNER 
VEERKIALYELMKLTQEESFSVWDEHFKTILLLLLETLGDKEPT 
IRAIiALKVLREILRHQPARFKNYAELTVMKTLEAHIOPHKEVVR 
SAEEAASV\LATS I \SPEQCIKVLCPI IQTADYPINLAAIKMQT 
KVIERVSKETLNLLLPEIMPGLIQGYDNSESSVRKACVFCLVAV 
HAVIGDE LKPHLSQLTGSKMKLLNLY I KRAQTGSGGADPTTDVS 
GQS 


6004 


140 


4098 


GKLRAFRGMRRLICKRICDYKSFDDEES VDGNRPSSAASAFKVP 
APKTSGNPANSARKPGSAGGPK VGAGAS KEGGAGAVDEDDF I KA 
FTDVPSIQIYSSRELEETLNKIREILSDDKHDWDQRANALKKIR 
SLLVAGAAQYDCFFQHLRLLDGALKLSAKDLRSQWREACITVA 
HL S TVLGNKFDHGAE A I VP TLFNLVPNSAKVMAT S G CAA I R F 1 1 
RHTHVPRLI PLITSNCTSKSVPVRRRS FEFIiDLLLQEWQTHSLE 
RHAAVLVET I KKG I H D ADAEAR VEAR KT YMGLRNH F PG EAETL Y 
NSLEPSYQKSLQTYLKSSGSVASLPQSDRSSSSSQESLNRPFSS 
KWSTANPSTVAGRVSAGSSKASSLPGSLQRSRSDIDVNAAAGAK 
AHHAAGQS VRSGRLGAGALNAGS YASLEDTSDKLDGTAS EDGRV 
RAKLSAPIxAGMGNAKADSRGRSRTKMVSQSQPGSRSGSPGRVLT 
TTALSTVSSGVQRVLVNSASAQKRSKIPRSQGCSREASPSRLSV 
ARSSRIPRPSVSQGCSREASRESSRDTSPVRSFQPLASRHHSRS 
TGAL YAP E VYG AS G PG YG I SQ S S RLS S S VS AMRVLNTGS D VE EA 
VADALLLGD I RTKKKPARRR YES YGMHSDDDANSDAS SACSERS 
YSSRNGSIPTYMRQT\EDV\ABVLNRCASSNWSERKEGLLGLQN 
LLKNQRTLSRVELKRLCEIFTRMFADPHGKRVFSMFLETLVDFI 
QWKI)DU3DWLFVLLTQLLKKMGADLLGSVQAKVQKALDVTRES 
FPNDLQFNI LMRFTVDQTQTPS LKVKVAI LKYI ETLAKQMDPGD 
F INS S ETRLAVS R VI T WTTE P KS S D VRKAAQS VL I S L FE LNTPE
FTMLLGALPKTFQDGATKLLHNHLRNTGNGTQSSMGSPLTRPTP
RSPANWSSPLTSPTNTSObrTLSPSAPDYnTPMMMQPnTVQQT i?n 

VTEAIQNFS FRSQEDMNEPLKRDS KKDDGDSMCX3GPG \MSDPRA 
GGDATDSSQTAL\DNKASLLHSMPTHS S PRSRDYNP YNYSDS I S 
PFNKSALKEAMFDDDADQFPDDLSLDHSDLVAELLKELSNHNER 
VEERKIALYELMKLTQEESFSVWDEHFKTILLLLLETLGDKEPT 
IRALALKVLREILRHQPARFKNYAELTVMKTLEAHKDPHKEWR 
SAEEAASV\LATSI\SPEQCIKVIiCPIIQTADYPINLAAIKMQT 
KVIERVSKETLNLLLPEIMPGLIQGYDNSESSVRKACVFCLVAV 
HAVIGDELKPHLSQLTGSKMKLLNLYIKRAQTGSGGADPTTDVS 
GQS 


6005 


133 


5955 


RSSGRRQEQLGQFPGRERKGMASGLGSPSPCSAGSEEEDMDALlT

NNSLPPPHPENEEDPEEDLSETETPKLKKKKKPKKPRDPKIPKS 

KRQ KKERMLLCRQLG DS SG EG P E FVE E E E E VALRSD S EG S D Y T P 

GKKKKKKLQPKKEKKSKSKRKEEEEEDDDDDDDSKEPKSSAQLL 

EDWGMEDIDHVFSEEDYRTLTNYKAFSQFVRPLIAAKNPKIAVS 

KMMMVLGAKWRE FSTNNPFKG S SGAS VAAAAAAAVAWESM VTA 

TEVAPPPPPVEVPIRKAKTKEGKGPNARRKPKGSPRVPDAKKPK 

P KKVAP L K I KLGG FGSKRKRSSS E DDDLD VE SDFDDAS INS YS V 

SDGSTSRSSRSRKKLRTTKKKKKGEEEVTAVDGYETDHQDYCEV 

CQQGGEIILCDTCPRAYHMVCLDPDMEKAPEGKWSCPHCEKEGI 

QWEAKEDNSEGEEILEEVGGDLEEEDDHHMEFCRVCKDGGELLC 

CDTCPSSYHIHCLNPPLPEIPNGEWLCPRCTCPALKGKVQKILI 

WKWGQP PS PTP VPRP PDADPNTPS PKPLEGRPERQFFVKWQGMS 

YWHCSWVSELQLELHC\QVMFRNYQRKNDMDEPPSGDFGGDEEK 

S\RKRKNKDPKFAEMEERFYRYGIKPEW\MMIHRILNHSVDKKG 

HVHYLIKWRDLPYDQASWESEDVEIQDYDLFKQSYWNHRELMRG 

EEGRPGKKLKKVKLRKLERPPETPTVDPTVKYERQPEYLDATGG 

TLHPYQMEGLNWLRFSWAQGTDTIIiADEMGLGKTVQTAVFLYSL 

YKEGHS KGP FLVS APLST I IN\ WERE FEMW APDM YV\VTY VGDK 

DSRAI I RENE FS \ FEDNA I RGG KKAS RMKKEAS VKFHVLLTS YE 

L I T I DMA I LG S I DW ACL I VDE AHRL KNNQS K F FR VLNG YS LQHK 

LLL TG T PLQNNLEELFHL LNFLTP ER FHNLEG FLE E FAD I AKE D 

QIKKLHDMLG\PHMLRRLKADVFKNMPSKTELIV\RVELSPM\Q 

KKYYK\YILHSKFIjKALN\ARGGGNQVSLLNWMDLKKCCNHPY 

LFPVAAMEAPKMPNGMYDGSALIRASGKLLLLQKMLKNLKEGGH 

RVL I FSQMTKMLDLLED FLEHEGYKYERIDGG I TGNMRQEAI DR 

FNAPGAQQFCFLLSTRAGGLG INLATADTVI I YDSDWNPHND I Q 

AF S RAHR I GQNKKVM I YR F VTRAS VEER I TQVAKKKMMLTHLW 

RPGLGS KTGSMS KQELDD I LKFGTEELFKDEATDGGGDNKEGED 

SSVIHYDDKAIERLLDRNQDETEDTELQGMNEYLSSFKVAQYW 

REEEMGEEEEVEREIIKQEESVDPDYWEKLLRHHYEQQQEDLAR 

NLGKGKRIRKQVNYNDGSQEDRDWQDDQSDNQSDYSVASEEGDE 

DFDERSEAPRRPSRKGLRNDKDKPLPPLLARVGGNIEVLGFNAR 

QRKAFLNAIMRYGMPPQDAFTTQWLVRDLRGKSEKEFKAYVSLF 

MRHLCEPGADGAETFADGVPREGLSRQHVLTRIGVMSLIRKKVQ 

c» c Ctn v iv »j k n & i*i f ttUWL V h. h W KKJVlby P(j b P b PKTP TP S TPGDTQP 

NTPAP VP PAEDG I KI E ENS LKE E ES I EGE KE VKS TAP E TA I E CT 

QAPAPASEDEKVWEPPEGEEKVEKAEVKERTEEPMETEPKGKG 

AADVEKVEEKSAIDLTPIWEDKEEKKEEEEKKEVMLQNGETPK 

DLNDEKQKKNIKQRFMFNIADGGFTELHSLWQNEERAATVTKKT 

YE I WKRRHDYWLLAG I INHGYARWQD IQNDPRYAI LNE PFKGEM 

NRGNFLEIKNKFLARRFKLLEQALVIEEQLRRAAYLNMSEDPSH 

PSMALNTRFAEVECLAESHQHLSKESMAGNKPANAVLHKVLKQL 

EELLSDMKADVTRL PATIAR I PPVAVRLQMSERNI LSRLANRAP 

EPTPQQVAQQQ 


6006 


1 


965 


DNDFLRNTVHRHE P PVTAE P I RLLAENE D VWVD KPSS I PVHP C 
GRFRHNTVIFILGKEHQLKELHPLHRLDRLTSGVLMFAKTAAVS 
ERIHEQVRDRQLEKEYVCRVEGEFPTEEVTCKEPILWSYKVGV 
CRVDPRGKPCETVFQRLSYNGQSSWRCRPLTGRTHQIRVHLQF 
LGH P I LND P I YNS VAWG PSRGRGG YI P KTNE ELLRDL VAEHQAK
qsldvldlcegdlspgltdstapsselgkddleelaaaa\qkme 

BVAEAAPQELDTIALASEKAVETDVMNQ\RQT\TLCRVPAGATG 
SLAPRPCDVPTCPTL 


6007 


3


2351 


HELGQVEYVFTDKTGTLTENEMQFRECSINGMKYQEINGRLVPE 

gptpdssegnlsylsslshlnnlshlttsssfrtspenetelik 
ehdlffkavslchtvqinnvqtdctgdgpwqsnlapsqleyyas 
spdekalveaaarigivfignseetmevktlgklerykllhile 
fdsdrrrmsvivqapsgekllfakgaessilpkciggeiektri 
hvdefalkglrtlciayrkftskeyeeidkrifeartalqqr\e 
eklaavfqfiekdlillgatavedrlqdkvretiealrmagikv 
wvltgdkhetavsvslscghfhrtmnilelinqksdsecaeqlr 

QLARR I TEDHVI QHGLWDGTS LS LALR EHE KLFME VCRNCS AV 

LCCRMAPLQKAKVIRL ikis pekpi tlavgdgandvsm iqeahv 

GIGIMGKEGRQAARNSDYAIARFKFLSKLLFVHGHFYYIRIATL 
VQYFFYIOWCFITPQFLYQFYCLFSQQTLYDSVYLTLY\NICFT 
SLPILIYSLLEQHVDPHVLQNKPTLYRDISKNRLLSIKTFLYWT 
I LGFS HAFI F FFGS YLL IGKDTSLLGNGQM FGNWTFGTLVFTVM 
VITVTVKMALETHFWTWINHLVTWGSIIFYFVFSLFYGGILWPF 
LGSQNM YFVF IQLLSSGSAWFAI ILM WTCLFLDI I KKVFDRHL 
HPTSTEKAQLTETNAGIKCLDSMCCFPEGEAACASVGRMLERVI 
GRCSPTHISRSWSASDPFYTNDRSILTLSTMDSSTC 


6008 


4554 


1089 


AG VRRAG ARRG PGRAL P AG ATA VP P P S ARRRRR C P AP EHAG PAR

ASRPSQETMFQLPVNNLGSLRKARKTVKKILSDIGLEYCKEHIE 

DFKQFEPNDFYLKNTTWEDVGLWDPSLTKNQDYRTKPFCCSACP 

FSSKFFSAYKSHFRNVHSEDFENRILLNCPYCTFNADKKTLETH 

IKIFHAPNASAPSSSLSTFKDKNKNDGLKPKQADSVEQAVYYCK 

KCTYRDPLYEIVRKHIYREHFQHVAAPYIAKAGEKSLNGAVPLG 

SNAREESSIHCKRCLFMPKSYEALVQHVIEDHERIGYQVTAMIG 

HTNVWPRSKPLMLIAPKPQDKKSMGLPPRIGSLASGNV\RSLP 

SQQMVNRLSIPKPNLNSTGVWMMSSVHLQQNNYGVKSVGQGYSV 

GQSMRLGLGGNAPVS I PQQSQSVKQLLPSGNGRS YGLGSEQRSQ 

APARYSLQSANASSLSSGQLKSPSLSQSQASRVLGQSSSKPAAA 

ATGPPPGNTSSTQKWKICTICNELFPENVYSVHFEKEHKAEKVP 

AVAMYIMKIHNFTSKCLYCNRYLPTDTLLNHMLIHGLSCPYCRS 

TFND VE KMAAHMRM VH I DE E MG P KTDS TLS FDLTLQQGS HTN I H 

LLVTTYNLRDAPAESVAYHAQNNPPVPPKPQPKVQEKADIPVKS 

S PQAAVP YKKD VGKTLCPLC FS I LKGPI SDAIAHHLRERHQVIQ 

TVH P VEKKLT YKC I HCLGVYTSNMTAS T ITLHLVHCRGVGKTQN 

uuutt.iWAFbKJjNU5PSIaAPVKRTYEQMEFPLLKKRKLDDDSDSP 

S FFE EKPEEP WLALDPKGH \ EDDS YEARKS FLTKYFT \ KQPYP 

TRREIEKLAASLWV\WK\SDIASHFSNKRKKCVRDCEKYKPGVL 

LGFNMKELNKVKHEMDFDAEGLFENHDEKDSRVNASKTADKKLN 

LGKEDDSSSDSFENLEEESNESGSPFDPVFEVEPKISNDNPEEH 

VLKVIPEDASESEEKLDQKEDGSKYETIHLTEEPTKLMHNASDS 

EVDQDDWEWKDGASPSESGPGSQQVSDFEDNTCEMKPGTWSDE 

SSQSE DAR SSK PAAKKKATMQGDR EQLKWKNS S YGKVEG F WS KD 

QS Q WKN AS ENDERLSNPQ I EWQNST I DSEDGEQFDNMTDGVAEP 

MHGSLAGVKLSSQQA 


6009 


4272 


1534 


CHGLQHLTPFRELNLSLQG*EPH+AA*QAVRSEEKSIC+GSPSC
HLVLGVLVPVARQSSHSAGPAQSAFR*TGTGSGTPKAAEQSGYW 
EAYTLGHQHWNMFPIQRPPLVMKGRRIMCGKCEKG*VSDSVTGG 
RAVAGEOASORRTVFTAGGGErT.rJATfci\/T?ac:vTrTr2MnDr , T7Mr'T t 

NGKRGGCFESGYLFGFIVIGKIQSLEAKVPLPVNGQTGERASPG 
NCRIHIVDAVC*SEHH*DHFIiAAAFLENSTIIS*VAPGSWQDHA 
VLQKEVQASVRCRGFESVDTAPAGFWAHSPPGLQGEPTTTSVSL 
FVLAPQDGEGVP FVEGQLVTVLGLWPQS IRHTFVHHTQLFLHP 
I * KLGALDVAFLHLLTLVCSS FNVAYG *GKNGGTTLHQLFAEVN 
AVTRG S AVQRR P S I T I S S 1HVDTKIQQELHDVMVAGADGWQWG 
DPFWGLAGIFHLIDDPLHQIELSFQRRV*EQCQGVKPDSQPVP 
RPLRVGL.LQVGPLVRGGGRRVAGRGKRCWRDLLFPWRWGLSHRT 
RDLLRGGDRGHWVIVLCRLGSLVGGLGTDELLWFGGR*LIIIG
I * * RGRLSGEWGCGLGRGELFQVS IGI GVS I VH I GQGDHEVLGG 
AG L VE RGALHATGQG VEAL VQQLLD VGPAGALGLCDGAAL FQG P 
GRVGQLPAEGLQVCITLVAQWRMHDGRELGGAEWPWQALHGAAI 
CGVGGAILLKALSQYFLKGG*RLWCARGQ*PVKKRQRRWRG*TR 
R*NGLTIHCFN*LI*GAVCCRLVILRWCGLLEVHGVYGT*IHCL 
GSFPGRLWP*PFISQERPNGHCQWEFRLAVPSWKCRWSRWRVRG 
TWRYGNPLLNLL*GAWLGGAACGGQQGGPLSTWQACTGPGQAAF 
LP P FQGACR PRTQRCRTWVCP I AWRQLLAYTRD 


6010 


1 


3533 


IMPCGSSRLLRGCWTHPNEPVSDLSYFDCIESVMENSKVLGESM
AG I S QNAKTGD L P AFGE C VG I AS KAL CGLTEAAAQAAYLVG I FD 
PNS Q AG HQGL VD P IQFARANQA I QMACQNLVD PGS S PS Q VLS AA 
T I VAKHTSAL CNACR I AS S KTAN PVAKRH FVQ SAKE VANS TANL 
VKTI KALDGD F S E DNRNKCR I ATAP L I EAVENLTAFAS NPE F VS 
IPAQISSEGSQAQEPILVSAKPMLESSSYLIRTARSLAINPKDP 
PTWS VLAGHSHTVSDS IKSLITSI RDKAPGQRECDYS IDG I NRC 
IRDIEQASLAAVSQSLATRDDISVEALQEQLTSWQEIGHLIDP 
IATAARGEAAQLGHKGTQLASYFEPLILAAVGVASKILDHQQQM 
TVLDQTKTLAESALQMLYAAKEGGGNPKAQHTHDAITEAAQLMK 
EAVDD I M VTLNEAASE VGLVGGMVDA I AEAMS KLDEGT P PE P KG 
TFVDYQTTWKYSKAIAVTAQEMMTKSVTNPEELGGLASQMTSD 
YGHLAFQGQMAAAT AE P E E I G FQ I RTR VQDLGHGC I FL VQ KAG\ 
ALQVCPTDSYTKRELIECARAVTEKVSLVLSALQAGNKGTQACI 
TAATAVSG I I ADLDTT I MFATAGTLNAENSET FADHREN I LKTA 
KALVEDTKLLVSGAAST PDKLAQAAQS SAATI TQLAE WKLGAA 
SLGSDDPETQWLINAIKDVAKALSDLISATKGAASKPVDDPSM 
YQLKGAAKVMVTNVTSLLKT VKAVEDEATRGTRALEATI E CIKQ 
ELTVFQS KDVPE KTS S PEE S I RMTKG I TMATAKAVAAGNS CRQE 
DVIATANLSRKAVSDMLTACKQASFHPDVSDEVRTRALRFGTEC 
TLGYLDLLEHVLVILQKPTPELKQQLAAFSKRVAGAVTELIQAA 
EAMKGTE WVDPEDPTVIAE TELLGAAAS I EAAAKKLEQLKPRAK 
PKQADETLDFEEQ ILEAAKS I AAATSALVKSASAAQRELVAQGK 
VGS I PANAADDGQWSQGLI SAARMVAAATSSLCEAANASVQGHA 
S EEKL I S S AKQVAAS TAQLLVACKVKADQDSEAMRRLQAAGNAV 
KRASDNLVRAAQKAAFGKADDDD WVKTKFVGG I AQ 1 1 AAQEEM 
LKKERELEEARKKLAQIRQQQYKFLPTELREDEG 


6011 


446 


1835 


LLQPAMRKSPGLSDCLWAWILLLSTLTGRSYGQPSLQDELKDNT
TVFTR I L DRLLDG YDNRLRP GLGER VTE VKTD I F VTS FGPVSDH 
DMEYTI DVFFRQS WKDERLKFKG PMTVLRLNNLMAS KI WTPDTF 
FHNGKKS VAHNMTMPNKLLR I T EDGTLL Y TMRLT VR \ AEC PMAF 
GRDFPM\D \ AHACPLKFGS YAYTRAE WYEWTRE PARS VWAED 
GS RLNQYDLLGQTVDS GI VQS STGEYWMTTHFHLKRKIG Y FVI 
QTYLPCIMTVILSQVSFWLNRESVPARTVFGVTTVLTMTTLSIS 
ARNSL P KVA YATAMD W F I AVC YAFVFS AL I EFAT VNY FTKRG YA 
WDGKSWPEKPKKVKDPLIKKNNTYAPTATSYTPNLARGDPGLA 
TIAKSATIEPKEVKPETKPPEPKKTFNSVSKIDRLSRIAFPLLF 
G I FNLVYWATYLNRE PQLKAPTPHQ 


6012 


351 


5013 


PAELFQSFAIWHKELYDWRLGPWNQCQPVISKSLEKPLECIKGE 
EGIQVREIACIQKDKDIPAEDIICEYFEPKPLLEQACLIPCQQD 
CIVSEFSAWSECSKTCGSGLQHRTRHWAPPQFGGSGCPNLTEF 
Q VCQ S S P CE AE ELR YS LHVG P W S TCSM PHS RQ VRQARRRG KNKE 
REKDRSKGVKDPEARELIKKKRNRNRQNRQENKYWDIQIGYQTR 
EVMCINKTGKAADLSFCQQEKLPMTFQSCVITKECQVSEWSEWS 
PCS KTCHDM VS PAGTR VRTRT IRQFPIGSEKECPEFEEKEPCLS 
QGDG WP CAT YGWR TTE WTECR VDP LLS QQDKRRGNQTALCGGG 
I QTR E V YCVQ ANENLLS QLSTH KNKEAS KPMDLKL CTG P I PNTT 
QLCHIPCPTECEVSPWSAWGPCTYENCNDQQGKKGFKLRKRRIT 
NE P TGG SG VTGNC PHLLEA I P C EE PACYDW KAVRLGDCE PDNG K 
ECGPGTQVQEWCINSDGEEVDRQLCRDAI FP I PVACDAPCPKD 
CVLSTWSTWSSCSHTCSGKTTEGKQIRARSILAYAGEEGGIRCP 
NSSALQEVRSCNEHPCTVYHWQTGPWGQCIEDTSVSSFNTTTTW 
NGEASCSVGMQTRKVI CVRVNVGQVGPKKCPESLRPETVRPCLL 






WO 01/53312 



PCT/US00/34263 



PCKKDCIVTPYSDWTSCPS\SCKEGDSSIRKQSRHRVIIQLPAN 
GGRDCTDPLYEEKACEAPQACQSYRW\KTHKW\HRCQ\LVP\WS 
VQQDSP\GAQEGCGPGRQARAITCRKQDGGQAGIHECLQYAGPV 
PALTQACQI PCQDDCQLTS WS KFSS CNGDCGAVRTRKRTLVGKS 
KKKEKCKNSHLYPLIETQYCPCDKYNAQPVGNWSDCILPEGKVE 
VLLGMKVQGD I KECGQG YR YQAMAC YDQNGRLVETSRCNSHG Y I 
EEACIIPCPSDCKLSEWSNWSRCSKSCGSGVKVRSKWLREKPYN 
GGRPCPKLDHVNQAQVYEWPCHSDCNQYLWVTEPWSICKVTFV 
NMRENCGEGVQTRKVRCMQNTADGPSEHVEDYLCDPEEMPLGSR 
VCKLPCPEDCVISEWGPWTQCVLPCNQSSFRQRSADPIRQPADE 
GRSCPNAVEKEPCNLNKNCYHYDYNVTDWSTCQLSEKAVCGNGI 
KTRMLDC VR S DG KS VD L KY CE ALG LE KNWQMNTS CMVE C P VNCQ 
LSDWSPWSECS QTCGLTG KM I RRRT VTQ P FQGDGR P C P S LMDQ S 
KP C P V KPC YRWQ YGQ WS P CQ VQE AQ CGE GTRTRNI S CWS DGS A 
DDFS KWDE E FCADI E L 1 1 DGNKNMVLE E S CSQPCPG D CYLKDW 
SSWSLCQLTCVNGEDLGFGGIQVRSRPVIIQELENQHLCPEQML 
ETKSCYDGQCYEYKWMASAWKGSSRTVWCQRSDGINVTGGCLVM 
SQPDADRSCNPPCSQPHSYCSETKTCHCEEGYTEVMSSNSTLEQ 
CTLIPVWLPTMEDKRGDVKTSRAVHPTQPSSNPAGRGRTWFLQ 
PFGPDGRLKTWVYGVAAGAFVLLI FI VSMI YLACKKPKKPQRRQ 
NNRLKPLTLAYDGDADM 


6013 


1161 


710 


GAFIAGVPVQPVLIRYPNSLDTTSWAWRGPGVLKVLWLTASQPC 
S I VDVE FLPVYHPSPEES RDPTL YANNVQRVMAQALG I PATECE 
FVGSLPVIWGRLKVALEPQL/WGTGKSASEGWAVRWLCGRWGR 
AR P E SNDQ PGR VCQAATAL 


6014 


2857 


613 


EA VAGGME KS RMNL P KG P D TLC FD KD E FMKED FD VDHF VS D CR K

RVQLEELRDDLELYYKLLKTAMVELINKDYADF\VNLSTNLVGM 

DKAI^NQLSVPLGQLREEVLSLRSSVSEGIRAVDERMSKQEDIRK 

KKMCVLRL I QVIRSVEKI E KILNS QSS KETSALEASS PLLTGQI 

LERIATEFNQLQFHACQSK\GMPLLDKVRPRIAGITAMLQQSLE 

GLLLEGLQTS DVD 1 1 RHCLRTYATI DKTRDAEALVGQVLVKP YI 

DEVIIEQFVESHPNGLQVMYNKLLEFVPHHCRLLREVTGGAISS 

EKGNTVPGYDFLVNSVWPQIVQGLEEKLPSLFNPGNPDAFHEKY 

TISMDFVRRLERQCGSQASVKRLRAHPAYHSFNKKWNLPVYFQI 

RFREIAGSLEAALTDVLEDAPAESPYCLLASHRTWSSLRRCWSD 

EMFLPLLVHRLWRLHSGRFWARYSVFV\N\ELSLRPISNESPKE 

IKKPL VTGSKE PS ITQGNTEDQGSGPSETKP WS I SRTQLVYW 

ADLDKLQEQLPELLEII KPKLEMIGFKNFSS ISAALEDSQSSFS 

ACVPS LSSK I IQDLSDS C FGFLKSALE VPRLYRRTNKE VPTTAS 

SYVDSALKPLFQLQSGHKDKLKQAIIQQWLEGTLSESTHKYYET 

VSDVLNSVKKMEESLKRLKQARKTTPANPVGPSGGMSDDDKIRL 

QLALDVE YLGEQ IQKLGLQASDIKS FSALAELVAAAKDQATAEQ 

P 


6015 



13 


2237 


AEGCAERRGTEPWELSMSWESGAGPGLGSQGMDLVWSAWYGKC 
VKGKGS LPLS AHG I WAWLS R7VEWDQVTVYLFCDDHKLQR YALN 
RITWJRSRSGNELPLAVASTADLIRCKLLDVTGGLGTDELRLLY 
GMALVRFVNLISERKTKFAKVPLKCLAQEVNIPDWIVDLRHELT 
HKKMPHINDCRRGCYFVLDWLQKTYWCRQLENSLRETWELEEFR 
EGIEEEDQEEDKNIVVDDITEQKPEPQDDGKSTESDVKADGDSK 
GSEEVDSHCKKALSHKELYERARELLVSYEEEQFTVLEKFRYLP 
KA I KAWNNPS PR VE C VIAE LKGVTCENR EAVLD AFLDDG FLVP T 
PEOIiAALO I EYE ENVDLND VL.VPKP F <3 O FWf) PT .T.PfiT.HQ OM PTn 

ALIiERMLSELPALGISGIRPTYILRWTVELIVANTKTGRNARRF 
SAGQ WEARRGWRLFNCSASLDWPRM VES CLGS PCWAS PQLLR I I 
F\KAMGQGLQDE\EQEKLLRICSIYTQSGENSLVQEGSEASPIG 
KSPYTLDSLYWSVKPASSSFGSEAKAQQQEEQGSVNDVKEEEKE 
EKEVLPDQVEEEEENDDQEEEEEDEDDEDDEEEDRMEVGPFSTG 
QES PTAENARLLAQKRGALQGS AWQ VSS EDVRWDTFP\LGRMPR 
S RPRT P AE LMLEN YDTHV I FWTKPVL \ E QRLE PS TCK\TDTLG L 
\ S CG VGS \ GNCSNSS SSNFRGAFLLEARGSLH \ GL \ KTGLQIiF 


6016


13 


2237 


ASGCAERRGTEPWELSMSWESGAGPGLGSQGMDLVWSAWYGKC
VKGKGSLPLSAHGIWAWLSRAEWDQVTVYLFCDDHKLQRYALN 
RITVWRSRSGNELPLAVASTADLIRCKLLDVTGGLGTDELRLLY 
GMALVRFVNLISERKTKFAKVPLKCLAQEVNIPDWIVDLRHELT 
HK KMPH I NDCRRG C YFVLDWLQKT YW CRQLENS LRETWELE E FR 
EGIEEEDQEEDKNIWDDITEQKPEPQDDGKSTESDVKADGDSK 
GSEEVDSHCKKALSHKELYERARELLVSYEEEQFTVLEKFRYLP 
KAI KAWNNPS PRVECVLAELKGVTCENREAVLDAFLDDGFLVPT 
FEQLAALQIEYEENVDLNDVLVPKPFSQFWQPLLRGLHSQNFTQ 
ALL E RMLS E L P ALG I S G I RPTY I LRWTVE L I VANTKTGRNARRF 
SAGQWEARRGWRLFNCSASLDWPRMVESCLGSPCWASPQLLRII 
F \ KAMGQGLQDE \ EQE KLLR I CS I YTQSGENS LVQEGS E AS PIG 
KS P YTLDSLYWSVKPASSS FGSEAKAQQQEEQGSVNDVKEEEKE 
EKEVLPDQVEEEEENDDQEEEEEDEDDEDDEEEDRMEVGPFSTG 
Q E S P TAENARLLAQ KRGALQG S AWQ VSS E DVR WDT F P \ LGRM PR 
SRPRTPAELMLENYDTHVIFWTKPVL\EQRLEPSTCK\TDTLGL 
\ S CG VG S \ GNC S NS S S SN FRGAFLLEARG S LH \ GL \ KTGLQLF 


6017 


203 


3469 


SHQEIEQNSAMAPRKRGGRGISFIFCCFRNNDHPEITYRLRNDS 
N F ALQTME PAL P M P P VE E LD VM FS E L VDELDLTD KHRE AM F ALP 
AE KKWQIYCS KKKDQEENKGATS WPE FY I DQLNSMAARKSLLAL 
EKEEEEERSKTIESLKTALRTKPMRFVTRFIDLDGLSCILNFLK 
TMDYETSESR IHTSLIGCI KALMNNSQGRAHVLAHSES INVIAQ 
SLSTENIKTKVAVLEILGAVCLVPGGHKKVLQAMLHYQKYASER 
TRFQTLINDLDKSTGRYRDEVSLKTAIMSFINAVLSQGAGVESL 
DFRLHLRYE\FLMLGIHPVMDKLRKHENSTLDRHLDFFEMLRNE 
DELEFAKRFELVHIDTKSATQMFELTRKRLTHSEAYPHFMSILH 
HCLQMPYKRSGNTVQYWLLLDRIIQQIVIQNDKGQDPDSTPLEN 
FNI KNWRML VNENEVKQWKEQAEKMRKEHNELQQKLEKKEREC 
DAKTQEKEEMMQTLNKMKEKLEKETTEHKQVKQQVADLTAQLHE 
LSRRAVCASIPGGPSPGAPGGPFPSSVPGSLLPPPPPPPLPGGM 
LPPPPPPLPPGGPPPPPGPPPLGAIMPPPGAPMGLALKKKSIPQ 
PTNALKS FNWS KLPENKLEGTVWTEIDDTKVFKI LDLEDLERTF 
SAYOROODFFVWSNSKOKEADAIDDTL.S < 3ICT.TrV7K'P , T Q\/Tnr«ooA 

QNCNILLSRLKLSNDEIKRAILTMDEQEDLPKDMLEQLLKFVPE 
KSDIDLLEEHKHELDRMAKADRFLFEMSRINHYQQRLQSLYFKK 
KFAERVAEVKPKVEAIRSGSEEVFRSGALKQLLEWLAFGNYMN 
KGQRGNAYGFKISSLNKIADTKSSIDKNITLLHYLITIVENKYP 
SVLNLNEELRDIPQAAKVNMTELDKEISTLRSGLKAVETELEYQ 
KSQPPQPGDKFVSWSQFITVASFSFSDVEDLLAEAKDLFTKAV 
KHFGEEAGKIQPDEFFGIFDQFLQAVSEAKQENENMRKKKEEEE 
RRARMEAQLKEQRERERKMRKAKENSEESGEFDDLVSALRSGEV 
FDKDLSKLKRNRKRITNQMTDSSRERPITKLNF 


6018 
6019 


13 
2 


2510 
1066 


TISQSGGIRRRREAVWFEWNMDFSRLHMYSPPQCVPENTGYTY 

ALSSSYSSDALDFETEHKLDPVFDSPRMSRRSLRLATTACTLGD 

GEA VGADS GTS SA VS L KNRAAR TTKQRR S TNKS A FS I NHVS RQ V 

TS SGVS YGGTVSLQDAVTRRP PVLDES W I REQTTVDHFWGLDDD 

GDLKGGNKAAIQGNGDVGAGAATGHNGFFCSNCNMLSERKDVLT 

AH PAAPG P VS R V YS RD RNQ KCDD CKG KRHLDAH PG RAGTLWH I W 

ACAG Y FLLQ ILRR IGAVGQAVS RTAWS ALWLAWA PGKAASGVF 

WWLGIGWYQFVTLISWLNVFLLTRCLRNICKFLVLLIPLFLLLG 

LSLRGQG\NFFSFLPVLNWASMHRTQRVDDPQDVFKPTTSRLKQ 

PLQGDSEAFPWHWMSGVEQQVASLSGQCHHHGENLRELTTLLQK 

LQARVDQMEGGAAGPSASVRDAVGQPPRETDFMAFHQEHEVRMS 

HLEDILGKLREKSEAIQKELEQTKQKTISAVGEQLLPTVEHLQL 

ELDQLKSELSSWRHVKTGCETVDAVQERVDVQVREMVKLLFSED 

QQGGSLEQLLQRFSSQFVSKGDLQTMLRDLQLQILRNVTHHVSV 

TKQLPTS EAWSAVS EAGASG I TEAQARAI VNSALKL YSQDKTG 

MVDFALESGGGSILSTRCSETYETKTALMSLFGIPLWYFSQSPR 

WIQPDIYPGNCWAFKGSQGYLWRLSMMIHPAAFTLEHIPKTL 

SPTGNISSAPKDFAVYGLENEYQEEGQLLGQFTYDQDGESLQMF 

QALKRPDDTAFQIVELRIFSNWGHPEYTCLYRFRVHGEPVK 

T PNDREP P VQR PPS S RRASHLAQE I TS AASLGDQTQ I LGS LTTA
PVITSAIRSMPGISSQILTNAQGQVIGTLPWWNSASVAAPAPA
QS LQVQAVTPQLLLNAQGQ V I ATLAS S PLP PPVAVR K \ PSTPES 
LLKSEVQPIKPTPTVPQPAWIASPAPAAKPSASAPIPITCSET 
PTVSQLVSKPHTPSLDEDGINLEEIREFAKNFKIRRLSLGLTQT 
QVGQALTATEGPAYSQSAICRFEKLDITPKSAQKLKPVLEKWLN 
EAE LRNQEGQQNLME FVGGE PS KKRKRRTS FTPQAI EALNAY FE 

KNPLPTGQEITEIAKELNYDREVVRWFCNRRQTLKNTSKLNVF 
QIP 


6020 


4953 


549 


EAIQFEVSIGNYGNKFDTTCKPLASTTQYSRAVFDGNYYYYLPW 
AHTKPWTLTSYWEDISHRLDAVNTLLAMAERLQTNIEALKSGI 
QG K I PANQLAEL W L KL I DE V I EDTRYTL P LT5G KANVT VLDTQ I 
RKLRSRSLSQ I HEAAVRMRSE ATD VKSTLAE I EDWLDKLMQLTE 
EPQNSMPDIIIWMIRGEKRLAYARIPAHQVLYSTSGENASGKYC 
GKTQTIFLKYPQEKNNGPKVPVELRVNIWLGLSAVEKKFNSFAE 
G T FT VF AE M Y ENQALM FG KWGT S G L VGRH K FS D VTG K I KLKRE F 
FLPPKGWEWEGEWIVDPERSLLTEADAGHTEFTDEVYQNESRYP 
GGDWKPAEDTYTDANGDKAASPSELTCPPGWEWEDDAWSYDINR 
AVDEKGWEYGITIPPDHKPKSWVAAEKMYHTHRRRRLVRKRKKD 
LTQTASSTAGAMEELQDQEGWEYASLIGWKFHWKQRSSDTFRRR 
RWRRKMAPSETHGAAAIFKLEGALGADTTEDGDEKSLEKQKHSA 
TT VFGANTP I VSCNFDRD YI YHLRC YVYQARNLLALDKDS FSDP 
YAHICFLHRSKTTEI IHSTLNPTWDQTI I FDEVEIYGEPQTVLQ 
NPPKVIMEL FDNDQ VG KD E FLGRS I FS P WKLNS EMD I TP KLLW 
HP VMNGDKACGDVL VTAELI LRGKDGSNLP I LP PQRAPNL YMVP 
QGIRPWQLTAIEILAWGLRNMKNFQMASITSPSLWECGGERV 
ESWIKNLKKTPNFPSSVLFMKVFLPKEELYMPPLVIKVIDHRQ 
FGRKPWGQCTIERLDRFRCDPYAGKEDIVPQLKASLLSAPPCR 
DIVIEMEDTKPLLASKCLSSMSTALSKMASPATVHLTEKEEEIV
DWWSKFYASSGEHEKCGQYIQKGYSKLKIYNCELENVAEFEGLT 
DFSDTFKLYRGKSDENEDPSWGEFKGSFRIYPLPDDPSVPAPP 
RQFRELPDSVPQECTVRIYIVRGLELQPQDNNGLCDPYIKITLG 

KKVIE\DRDHYIPNTLNPVFGRMYELSCYLPQEKDLKISVYDYD 
TFTRDEKVGETIIDLENPF\LSRFG\SHCG\IPEEYCVSGVNTW

RDSLR\PTQ\LLQNVARFKGFPQPILSEDGSRIRYGGRDYSLDE 
FEANKILHQHLGAPEERLALHILRTQGLVPEHVETRTLHSTFQP
NIS\RYYLRVIIWNTKDVILDEKSITGEEMSDIYVKGWIPGNEE
NKQKTDVHYRSLDGEGNFNWRFVFPFDYLPAEQLCIVAKKEHFW 
SIDQTEFRIPPR\LIIQIW\DNDKFS\LDDYLGFPRTLTCRHTI
HFLQKSPGGNC/RGLDMIPDLKAMNPLKAKTASLFEQKSMKGWW
PCYAEKDGARVMAGKVEMTLEILNEKEADERPAGKGRDEPNMNP
KLDLPNRPETSFLWFTNPCKTMKFIVWRRFKWVIIGLLFLLILL
LFVAVLLYSLPNYLSMKIVKPNV 


6021 


4953 


549 


EAIQFEVSIGNYGNKFDTTCKPLASTTQYSRAVFDGNYYYVLPW 
AHTKPWTLTSYWEDISHRLDAVNTLLAMAERLQTNIEALKSGI 
QGKIPANQLAELWLKLIDEVIEDTRYTLPLTEGKANVTVLDTQI 
RKLRSRSLSQIHEAAVRMRSEATDVKSTLAEIEDWLDKLMQLTE
EPQNSMPDIIIWMIRGEKRLAYARIPAHQVLYSTSGENASGKYC
GKTQTIFLKYPQEKNNGPKVPVELRVNIWLGLSAVEKKFNSFAE
GTFTVFAEMYENQALMFGKWGTSGLVGRHKFSDVTGKIKLKREF 
FLPPKGWEWEGEWIVDPERSLLTEADAGHTEFTDEVYQNESRYP 
GGDWKPAEDTYTDANGDKAASPSELTCPPGWEWEDDAWSYDINR 
AVDEKGWEYGITIPPDHKPKSWVAAEKMYHTHRRRRLVRKRKKD 
LTQTASSTAGAMEELQDQEGWEYASLIGWKFHWKQRSSDTFRRR 
RWRRKMAPSETHGAAAIFKLEGALGADTTEDGDEKSLEKQKHSA 
TTVFGANTPIVSCNFDRDYIYHLRCYVYQARNLLALDKDSFSDP
YAHICFLHRSKTTEIIHSTLNPTWDQTIIFDEVEIYGEPQTVLQ
NPPKVIMELFDNDQVGKDEFLGRSIFSPWKLNSEMDITPKLLW
HPVMNGDKACGDVLVTAELILRGKDGSNLPILPPQRAPNLYMVP
QGIRPWQLTAIEILAWGLRNMKNFQMASITSPSLWECGGERV
ESWIKNLKKTPNFPSSVLFMKVFLPKEELYMPPLVIKVIDHRQ 
FGRKPWGQCTIERLDRFRCDPYAGKEDIVPQLKASLLSAPPCR 
DIVIEMEDTKPLLASKCLSSMSTALSKMASPATVHLTEKEEEIV
DWWSKFYASSGEHEKCGQYIQKGYSKLKIYNCELENVAEFEGLT
DFSDTFKLYRGKSDENEDPSVVGEFKGSFRIYPLPDDPSVPAPP 
RQFRELPDSVPQECTVRIYIVRGLELQPQDNNGLCDPYIKITLG 
KKVIE\DRDHYIPNTLNPVFGRMYELSCYLPQEKDLKISVYDYD 
TFTRDEKVGETIIDLENPF\LSRFG\SHCG\IPEEYCVSGVNTW
RDSLR\PTQ\LLQNVARFKGFPQPILSEDGSRIRYGGRDYSLDE 
FEANKILHQHLGAPEERLALHILRTQGLVPEHVETRTLHSTFQP 
NIS\RYYLRVIIWNTKDVILDEKSITGEEMSDIYVKGWIPGNEE 
NKQKTDVHYRSLDGEGNFNWRFVFPFDYLPAEQLCIVAKKEHFW 
SIDQTEFRIPPR\LIIQIW\DNDKFS\LDDYLGFPRTLTCRHTI
HFLQKSPGGNC/RGLDMIPDLKAMNPLKAKTASLFEQKSMKGWW 
PCYAEKDGARVMAGKVEMTLEILNEKEADERPAGKGRDEPNMNP 
KLDLPNRPETSFLWFTNPCKTMKFIVWRRFKWVIIGLLFLLILL
LFVAVLLYSLPNYLSMKIVKPNV 


6022 


4953 


549 


EAIQFEVSIGNYGNKFDTTCKPLASTTQYSRAVFDGNYYYYLPW

AHTKPWTLTSYWEDISHRLDAVNTLLAMAERLQTNIEALKSGI 

QGKIPANQLAELWLKLIDEVIEDTRYTLPLTEGKANVTVLDTQI 

RKLRSRSLSQIHEAAVRMRSEATDVKSTLAEIEDWLDKLMQLTE

EPQNSMPDIIIWMIRGEKRLAYARIPAHQVLYSTSGENASGKYC

GKTQTIFLKYPQEKNNGPKVPVELRVNIWLGLSAVEKKFNSFAE 

GTFTVFAEMYENQALMFGKWGTSGLVGRHKFSDVTGKIKLKREF 

FLPPKGWEWEGEWIVDPERSLLTEADAGHTEFTDEVYQNESRYP

GGDWKPAEDTYTDANGDKAASPSELTCPPGWEWEDDAWSYDINR 

AVDEKGWEYGITIPPDHKPKSWVAAEKMYHTKRRRRLVRKRKKD 

LTQTASSTAGAMEELQDQEGWEYASLIGWKFKWKQRSSDTFRRR

RWRRKMAPSETHGAAAIFKLEGALGADTTEDGDEKSLEKQKHSA 

TTVFGANTPIVSCNFDRDYIYHLRCYVYQARNLLALDKDSFSDP

YAHICFLHRSKTTEIIHSTLNPTWDQTIIFDEVEIYGEPQTVLQ

NPPKVIMELFDNDQVGKDEFLGRSIFSPWKLNSEMDITPKLLW 

HPVMNGDKACGDVLVTAELILRGKDGSNLPILPPQRAPNLYMVP

QGIRPWQLTAIEILAWGLRNMKNFQMASITSPSLWECGGERV

ESWIKNLKKTPNFPSSVLFMXVFLPKEELYMPPLVIKVIDHRQ

FGRKPWGQCTIERLDRFRCDPYAGKEDIVPQLKASLLSAPPCR 

DIVIEMEDTKPLLASKCLSSMSTALSKMASPATVHLTEKEEEIV 

DWWSKFYASSGEHEKCGQYIQKGYSKLKIYNCELENVAEFEGLT 

DFSDTFKLYRGKSDENEDPSWGEFKGSFRIYPLPDDPSVPAPP 

RQFRELPDSVPQECTVRIYIVRGLELQPQDNNGLCDPYIKITLG 

KKVIE\DRDHYIPNTLNPVFGRMYELSCYLPQEKDLKISVYDYD

TFTRDEKVGETIIDLENPF\LSRFG\SHCG\IPEEYCVSGVNTW

RDSLR\PTQ\LLQNVARFKGFPQPILSEDGSRIRYGGRDYSLDE 

FEANKILHQHLGAPEERLALHILRTQGLVPEHVETRTLHSTFQP 

NIS\RYYLRVIIWNTKDVILDEKSITGEEMSDIYVKGWIPGNEE 

NKQKTDVHYRSLDGEGNFNWRFVFPFDYLPAEQLCIVAKKEHFW 

SIDQTEFRIPPR\LIIQIW\DNDKFS\LDDYLGFPRTLTCRHTI 

HFLQKSPGGNC/RGLDMIPDLKAMNPLKAKTASLFEQKSMKGWW 

PCYAEKDGARVMAGKVEMTLEILNEKEADERPAGKGRDEPNMNP 

KLDLPNRPETSFLWFTNPCKTMKFIVWRRFKWVIIGLLFLLILL 

LFVAVLLYSLPNYLSMKIVKPNV 


6023 


102 


916 


SQELGMFVELNNLLNTTPDRAEQGKLTLLCDAKTDGSFLVHHFL 
SFYLKANCKVCFVALIQSFSHYSIVGQKLGVSLTMARERGQLVF 
LEGL/IVCSGR\VFQAQKEPHPLQFLREANAGNLKPLFEFVREA
LKPVDSGEARWTYPVLLVDDLSVLLSLGMGAVAVLDFIHYCRAT 
VCWELKGNMWLVHDSGDAEDEENDILLNGLSHQSHLILRAEGL 
ATGFCRDVHGQLRILWRRPSQPAVHRDQSFTYQYKIQDKSVSFF
AKGMSPAVL 


6024 


3 


3260 


FLSFLCYPRFRCLFCLQFAIPASRMEQLNELELLMEKSFWEEAE 
LPAELFQKKWASFPRTVLSTGMDNRYLVLAVNTVQNKEGNCEK
RLVITASQSLENKELCILRNDWCSVPVEPGDIIHLEGDCTSDTW
IIDKDFGYLILYPDMLISGTSIASSIRCMRRAVLSETFRSSDPA
TRQMLIGTVLHEVFQKAINNSFAPEKLQELAFQTIQEIRHLKEM
YRLNLSQDEI KQEVEDYLPSFCKWAGDFMHKNTSTDFPQMQLSL 
PSDNSKDNSTCNIEWKPMDIEESIWSPRFGLKGKIDVTVGVKI 
HRGYKTKYKIMPLELKTGKESNSIEHRSQWLYTLLSQERRADP 
EAGLLLYLKTGQMYPVPANHLDKRELLKLRNQMAFSLFHRISKS 
ATRQKTQLASLPQIIEEEKTCKYCSQIGNCALYSRAVEQQMDCS 
SVPIVMLPKIEEETQHLKQTHLEYFSLWCLMLTLESQSKDNKKN
HQNIWLMPASEMEKSGSCIGNLIRMEHVKIVCDGQYLHNFQCKH 
GAIPVTNLMAGORVIVSGEERSLFALSRGYVKEINMTTVTCLLD 
RNLSVLPESTLFRLDQEEKNCDIDTPLGNLSKLMENTFVSKKLR 
DLIIDFREPQFISYLSSVLPHDAXDTVACILKGLNKPQRQAMKK 
VLLSKDYTLIVGMPGTGKTTTICTLVRILYACGFSVLLTSYTHS
AVDNILLKLAKFKIGFLRSR\QIQKVHPAIQQFTEHEICRSKSI 
KS\LALLEELYTSQLIDATTCMGINHPIFSRKIFDFCIVDEASQ 
ISQPICLGPLFFSRRFVLVGDHQQLPPLVLNREARALGMSESLF 
KRLEQNKSAWQLTVQYRMNSKIMSLSNKLTYEGKLECGSDKVA 
NAVINLRHFKDVKLELEFYADYSDNPWLMGVFEPNNPVCFLNTD 
KVPAPEQVEKGGVSNVTEAKLIVFLTSIFVKAGCSPSDIGIIAP 
YRQQLKIINDLLARSIGMVEVNTVDKYQD\RDKSIVLVSFVRSN
KDGTVGELLKDWRRLNVAITRAKHKLILLGCVPSLNCYPPLEKL
LNHLNSEKLIIDLPSREHESLCHILGDFQRE


6025 


3977 


89 


GGFPAQSDHLPPVFPLRSDLLITMSTLYVSPHPDAFPSLRALIA 

ARYGEAGEGPGWGGAHPRICLQPPPTSRTSFPPPRLPALEQGPG 

GLWVWGATAVAQLLWPAGLGGPGGSRAAVLVQQWVSYADTELIP 

AACGATLPALGLRSSAQDPQAVLGALGRALSPLEEWLRLHTYLA 

GE APTLADLAAVTALLLPFR Y VLD P PARR I WNNVTR WF VTGVRQ 

PE FRAVLGE WL YSGAR P hS HQ PGP E APALP KTAAQLKKE AKKR 

EKLEKFQQKQKIQQQQPPPGEKKPKPEKREKRDPGVITYDLPTP 

PGEKKDVSGPMPDSYSPRYVEAAWYPWWEQQGFFKPEYGRPNVS 

AANPRGVFMMCIPPPNVTGSLHLGHALTNAIQDSLTRWHRMRGE

TTLWNPGCDHAGIATQVWEKKLWREQGLSRHQLGREAFLQEVW 

KWKEEKGDRIYHQLKKLGSSLDWDRACFTMDPKLSAAVTEAFVR 

LHEEGIIYRSTRLVNWSCTLNSAISDIEVDKKELTGRTLLSVPG 

Y KEKVE FG VLVS FAYKVQG S DS DEE WVATTR I E TM LGDVAVAV 

H PKDTR YQHL KG KNVI HPFLSRSLPIVFDE FVDMDFGTGA VKI T 

PAHDQNDYEVGQRHGLEAISIMDSRGALINVPPPFLGLPRFEAR 

KAVLVAL KERG L FRG I EDN PM WPL CNRS KDWEPLLRPQWYVR 

CGEMAQAASAAVTRGDLRILPERHQRTWHAWMDNIRE\WCMFPG 

KL WWG \ HR \ I PAYF VTVSD PAVPPGEDPDGRYWVSGRNEAEARE 

KAAKE FG VS PDKI S LQQDED VLDTWFS SGLFPLS I LG WPNQS ED 

LSVFYPGTLLETGHDILFFWVARMVMLGLKLTGRLPFREVYLHA 

I VRDAHGRKMSKSLGNVIDPLDVIYGISLQGLHNQLLNSNLDPS 

EVEKAKEGQKADFPAGIPECGTDALRFGLCAYMSQGRDINLDVN 

R I LGYRHFCNKLWNATKFALRGLGKGFVPS P TSQ PGGHES L VDR 

WIRSRLTEAVRLSNQGFQAYDFPAVTTAQYSFWLYELCDVYLEC 

LKPVLNGVDQVAAECARQTLYTCLDVGLRLLSPFMPFVTEELFQ 

RLPRRMPQAPPSLCVTPYPEPSECSWKDPEAEAALELALSITRA 

VR P \ LRAD YNLH P E S G PTC FLE VAD\EATGALASAVS G YVQG PG 

QAQWVAVAEPWGLPAP\QGCAVALASDRCSI\HLQLQG\LLDP 

ARE LG \ KLQ \ AKR VE AQ \ RQAQ \ RLR \E RRA\ ASGN P VKVPL \ E 

VQEADE AKLQQTE AE LRKVDE A I ALFQKML 


6026 


2674 


514 


GPITFLKKKAKMKDMPLRIHVLLGLAITTLVQAVDKKVDCPRLC 
i iKF wr i irKblYMEASTVDCNDLGLLTFPARLPANTQILLLQ 
TNNIAKIEYSTDFPVNLTGLDLSQNNLSSVTNINGKKMPQLLSV 
YLEENKLTELPEKCLSELSNLQELYINHNLLSTISPGAFIGLHN 
LLRLHLNSNRLQMINSKWFDALPNLEILMIGENPIIRIKDMNFK 
PL INLRSLVI AGINLTE I PDNALVGLENLES IS F YDNRLI KVPH 
VALQKWNLKFLDLNKN PINRI RRGDFSNMLHLKELG INNMPEL 
ISIDSLAVDNLPDLRKIEATNNPRLSYIHPNAFFRLPKLESIWL 
NSNALSALYHGTIESLPNLKEISIHSNPIRCDCVIRWMNMNKTN 
IRFMEPDSLFCVDPPEFQGQNVRQVHFRDMMEICLPLIAPESFP 
SNLNVEAGSYVSFHCRATA\EPQPEIYWITPSGQKLLPNT\LTD 
KFYVHSEGTLDINGVTPKEGQLYTCIATNLVGADLKSVMIKVDG
S FPQDNNGSLNI KIRDIQANSVLVSWKASSKILKS SVKWTAFVK 
TE NS HAAQS AR IPS D VKV YNLTHLN PSTEYKICIDIP T I YQ KNR 
KKCVim'TKGLHPDQKEYEKNNTTTLMACLGGLLGIIGVICLIS 
CLSPEMNCDGGHSYVRNYLQKPTFALGELYPPLINLWEAGKEKS 
TS L KVKATVI GLP TNMS 


6027


5254 


4148 


GGRRAPGRPGRS I KDEEEETVFRE WS FS PDPLPVRYYDKDTTK 
P I S FY LS S LEELLAWKPRLEDG FNVALE PLACR QP PLS SQR PR T 
LLCHDMMGGYLDDRFIQGSVVQTPYAFYHWQCIDVFVYFSHHTV 
TIPPVGWTNTAHRHGVCVLGTFITEWNEGGRLCEAFLAGDERSY 
QA VADR L VQ I T \ RF FR F DG WL I N I ENS LS LAAVGNM P P FLR YLT 
TQLHRQVPGGLVLWYDSWQSGQLKWQDELNQHNRVFFDSCDGF 
FTNYNWREEHLERMLGQAGERRADVYVGVDVFARGNWGGRFDT 

DKVGGGFRPRASGPVPPLGPHFLMDLPFPSAPQRNDSSCSSQSG 
DP VALRNRCPAPAKL CPH 


6028 


120 


3432 


NCLLLQAKGFHGEIEDLQQWLTDTERHLLASKPLGGLPETAKEQ 
LNVHMEVCAAFEAKEETYKSLMQKGQQMLARCPKSAETNIDQDI 
NNLKEKWESVETKLNER\KT\KLEEALNLA\MEFHNSL\QDFIN 
WLTQAEQTLNVASRPSLILDTVLFQIDEHKVFANEVNSHREQII 
ELDKTGTHLKYFSQKQDWLIKNLLISVQSRWEKWQRLVERGR 
SLDDARKRAKQFHEAWSKLMEWLEESEKSLDSELEIANDPDKIK 
TQLAQHKEFQKSLGAKHSVYDTTNRTGRSLKEKTSLADDNLKLD 
DMLSELRDKWDTICGKSVERQNKLEEA\LLFSGQFTDALQALID 
WLYRVEPQLAEDQPVHGDIDLVMNLIDNHKAFQE03LGKRTSSVQ 
ALKRS ARE L I E GSRDDS S WVKVQMQELSTRWETVCALS I SKQTR 
LEAALRQAEEFHSVVHALLEWLAEAEQTLRFHGVLPDDEDALRT 
LIDQHKEFMKKLEEKRAELNKATTMGDTVLAICHPDSITTIKHW 
ITIIRARFEEVLAWAKQHQQRLASALAGLIAKQELLEALLAWLQ
WAETTLTDKDKE VI PQE I EEVKAL I AEHQTFMEEMTRKQPD VDK 
VTKTYKRRAADPSSLQSHIPVLDKGRAGRKRFPASSLYPSGSQT 
QIETKNPRVNLLVSKWQQVWLLALERRRKLNDALDRLEELREFA 
NFDFDIWRKKYMRWMNHKKSRVMDFFRRIDKDQDGKITRQEFID 
GILSSKFPTSRLEMSAVADIFDRDGDGYIDYYEFVAALHPNKDA 
YKPITDADKIEDEVTRQVAKCKCAKRFQVEQIGDNKYRFFLGNQ 
FGDSQQLRLVR I LRSTVM VRVGGGWMALDEFLVKNDPCRAKGRT 
NMELREKFILADGASQGMAAFRPRGRRSRPSSRGASPNRSTSVS 
SQAAQAASPQVPATTTPKILHPLTRNYGKPWLTNSKMSTPCKAA 
ECSDFPVPSAEGTPIQGSKLRLPGYLSGKGFHSGEDSGLITTAA 
ARVRTQFADSKKTPSRPGSRAGSKAGSRASSRRGSDASDFDISE 
IQSVCSDVETVPQTHRPTPRAGSRPSTAKPSKI PTPQRKSPASK 
LDKSSKR 


6029 


1 


3533 


I M PCGS S R LL RGC WTHPNE P VS DLS Y FD CI ES VMENS KVLG ES M 
AG I S QNAKTGDL P AFGEC VG I AS KAL CG LTEAAAQAA YL VG I FD 
PNSQAGHQGLVDPIQFARANQAIQMACQNLVDPGSSPSQVLSAA 
T I VAKHTS ALCNACR IAS S KTANPVAKRHFVQSAKE VANSTANL 
VKTIKALDGDFSEDNRNKCRIATAPLIEAVENLTAFASNPEFVS 
IPAQISSEGSQAQEPILVSAKPMLESSSYLIRTARSLAINPKDP 
P T WSVLAGHS HT VS DS I KSL ITS I RDKAPGQRE CD YS IDG I NRC 
IRDIEQASLAAVSQSLATRDDISVEALQEQLTSWQEIGHLIDP 
I ATAARG EAAQLGHKGTQLAS Y FE PL I LAAVGVAS KI LDHQQQM 
TVLDQTKTLAESALQMLYAAKEGGGNPKAQHTHDAITEAAQLMK 
EAVDDIMVTLNEAASEVGLVGGMVDAIAEAMSKLDEGTPPEPKG 
TFVDYQTTWKYSKAIAVTAQEMMTKSVTNPEELGGLASQMTSD 
YGH LAFQ G QMAAATAE PEE I GFQ I RTRVQDLGHG C I FL VQKAG \ 
ALQVCPTDS YTKRELI ECARAVTE KVS LVLSALQAGNKGTQAC I 
TAATAVSGIIADLDTTIMFATAGTLNAENSETFADHRENILKTA
KALVEDTKLLVSGAASTPDKLAQAAQSSAATITQLAEWKLGAA 
SLGSDDPETQWLINAI KDVAKALSDLISATKGAASKPVDDPSM 
YQLKGAAKVMVTNVTS LLKTVKAVEDEATRGTRALEATIECI KQ 
ELTVFQSKDVPEKTSSPEESIRMTKGITMATAKAVAAGNSCRQE 
D V I ATANLS R KAVS DM LTACKQ AS FHPD VS DE VRTRALR FGTE C 
TLG YLDLLEH VLV I LQKPTPELKQQLAAFS KRVAGAVTELIQAA 
EAMKGTEWVD PEDPTVI AETELLGAAAS I EAAAKKLEQLKPRAK 
P KQADE TLDFE EQ I L EAAKS I AAATS AL VKS AS AAQREL VAQG K 
VGS I PANAADDGQWSQGLISAARMVAAATSSLCEAANAS VQGHA 
SEEKLI S S AKQVAAS TAQLLVACKVKADQDS E AMRR LQAAGNAV 
KRASDNLVRAAQKAAFGKADDDDWVKTKFVGGIAQIIAAQEEM
LKKERELEEARKKLAQIRQQQYKPLPTELREDEG 


6030 


3 


1777 


FPGRGSPALQLEVLI CLGLMGLERALNVLAPI FYRNI VNLLTEN 
APWNSLAWTVTSYVFLKFLQGGGTGSTGFVSNLRTFLWIRVQQF 
TSRRVELLIFSHLHELSLRWHLGRRTGEVLRIADRGTSSVTGLL 
SYLVFNVIPTLADI I IGI I YFSMFFNAWFGLI VFLCMSLYLTLT 
I WTE WRT KFRRAMNTQE NATRARAVDSLLN FE TVK Y YNAES YE 
VERYREAI IKYQGLEWKSSASLVLLNQTQNLVIGLGLLAGSLLC 
AYFVTEQKLQVGDYVLFGTYIIQLYMPLNWFGTYYRMIQTNFID
MENMFDLLKK\ETEVKDLPGAGPFRFQKGRIEFENVHFSYADGR 
ETLQDVSFTVMPGQTLALVGPSGAGKSTILRLLFRFYDISSGCI 
RIDGQDISQVTQALFRFSHWELCPKDTVLFNDTIADNIRYGRVT 
AGNDE VEAAAQAAG I HDAI MAFP EGYRTQVGERGLKLSGGEKQR 
VAIARTILKAPGIILLDEATSALDTSNERAIQASLAKVCANRTT 
IWAHRLSTWNADQILVIKDGCIVERGRHEALLSRGGVYADMW 
QLQQGQEETSEDTKPQTMER 


6031 


160 


1694 


LRMSENLDKSNVNEAGKSKSNDSEEGLEDAVEGADEALQKAIKS 
DSSSPQRVQRPHSSPPRFVTVEELLETARGVTNMALAHEIVVNG 
us VbJjFbNSLKKRVKEIVHKAFWDCLSVQLSEDPPAYDHA 
I KLVG E I KETLLS FLLPGHTRLRNQ I TEVLDLDL I KQE AENGAL 
DISKLAEFIIGMMGTLCAPARDEEVKKLKDIKEIVPLFREIFSV 
LDLMKVDMANFAI S S I RPHLMQQS VE YERKKFQE I LERQPNS LD 
FVTQWLEEAS EDLMTQKYKHALP VGGMAAGSGDMPRLS PVAVQN 
YAYLKLLKWDHLQRPFPETVLMDQSRFHELQLQ\REQLTILGAV
LL VT FS MAAPG I S S Q AD FAE KLKM I VK I LLTDMHL P S FHLKD VL 
TTIGEKVCLEVSSCLSLCX3SSPFTTDKETVLKGQIQAVASPDDP 
IRRIMESRILTFLETYLASGHQKPLPTVPGGLSPVQRELEEVAI 
KFARLVNYNKMVFCPYYDAILSKILVRS 


6032 


39 


2415 


AARLCRAQPTKSAWMIRDLSKMYPQTRHPAPHQPAQPFKFTISE 
SCDRIKEEFQFLQAQYHSLKLECEKLASEKTEMQRHYVMYYEMS 
YGLN I EMHKQAE I VKRLNAI CAQVI P FLSQEHQQQ WQAVERAK 
QVTMAELNAI IGQQQLQAQHLSHGHGLPVPLTPHPSGLQPPAIP 
PIGSSAGLLALSSALGGQSHLPIKDEKKHHDNDHQRDRDSIKSS 
S VS PSAS FRGAEKHRNSAD YS S ESKKQKTEEKE I AARYDSDGEK 
SDDNLVVD VSNEDPSSPRGSPAHSPRENGLDKTRLLKKDAP IS P 
AS IAS SS S TPS S KS KELSLNE KSTTP VS KSNTPTPRTDAPTPGS 
NSTPGLRPVPGKPPGVDPLASSLRTPMAVPCPYPTPFGIVPHAG 
M NuJS It I o rljAA I AUiiHN 1 fa V QMa AAAAAAAAAAAYGRS P WGFD 
P HHHMRVP A IPPNLTGIPGGK PAYS FHVS ADGQMQ P VP FP P DAL 
IGPG I PRHARQ INTLNHGE WCAVT I SNPTRHVYTGGKGCVKVW 
D I SHPGNKS PVSQLDCLNRDN Y I RSCRLLPDGRTL I VGGEASTL 
SIWDLAAPTPRIKAELTSSAPACYALAISPDSKVCFSCCSDGNI 
AVWDLHNQTLVRQFQGHTDGASCIDISNDGTKLWTGGLDNTVRS 
W \ DLREG RQLQQHD / F FTS P VFS LGY CP \ TEE WLAVGMENSN \ V 
EVLHVTKPDKYQLHLHESCVLSLKFAHCGKWF\VSTGKDNLLNA 
W\RTPYG\ASIF\QSKESSS\VLSCDI\SVDDKYIVTGS\GDK\ 
RATVYEVIY 


6033 


39 


2415 


AARLCRAQPTKSAWMIRDLSKMYPQTRHPAPHQPAQPFKFTISE 
SCDRIKEEFQFLQAQYHSLKLECEKLASEKTEMQRHYVMYYEMS 
YGLNIEMHKQAEIVKRLNAICAQVIPFLSQEHQQQWQAVERAX 
QVTMAELNAI IGQQQLQAQHLSHGHGLP VPLTPHPSGLQPPAI P 
PIGS SAGLLALSSALGGQSHL PI KDEKKHHDNDHQRDRDS I KS S 
SVS PSAS FRGAEKHRNSADYSS ESKKQKTEEKE I AARYDSDGEK 
S DDNL WDVS NED P S S P RGS PAHS PRENG LD KTRLLKKDAP ISP 
ASIASSSSTPSSKSKELSLNEKSTTPVSKSNTPTPRTDAPTPGS 
NSTPGLRPVPGKPPGVDPLASSLRTPMAVPCPYPTPFGIVPHAG 
i ji^^jejjj x o*. Ltril\ x ^ ir^l*Jo/\/\M/\/\>\/v\A/\/^ I IjKo Jr V VCirD 

PHHHMRVPAIPPNLTGIPGGKPAYSFHVSADGQMQPVPFPPDAL 
IGPGIPRHARQINTLNHGEWCAVTISNPTRHVYTGGKGCVKVW 
DISHPGNKSPVSQLDCLNRDNYIRSCRLLPDGRTLIVGGEASTL 
S I WDLAAPTPR I KAELTSSAPACYALAI S PDSKVCFSCCSDGNI 
AVWDLHNQTLVRQFQGHTDGASCIDISNDGTKLWTGGLDNTVRS 
W \ DLREGRQLQQHD/ FFTS PVFS LGYCP \TEEWLAVGMENSN\ V 
EVLHVTKPDKYQLHLHESCVLSLKFAHCGKWFWSTGKDNLLNA 

W\RTPYG\ASIF\QSKESSS\VLSCDI\SVDDKYIVTGS\GDK\ 
RATVYEVIY 


6034 


2683 


714 


ESGRRRRLKRRRSPCPGTAGGPGETNPGPGACPRGPREEAAAAM 
EIAPQEAPPVPGADGDIEEAPAEAGSPSPASPPADGRLKAAAKR 
VTFPSDEDIVSGAVEPKDPWRHAQNVTVDEVIGAYKQACQKLNC 
RQ I PKLLRQLQEFTDLGHRLDCLDLKGEKLDYKTCEALEEVFKR 
LQFKWDLEQTNLDE DGAS ALFDM I E Y YES ATHLN I S FNKH IGT 
RGWOAAAHMMRKTSr'TiOYT.\nARMTPT.T,nM<:aDTrv7 - ao»T o tdoc? 
liAVLHL E N AS LSGR P LMLLAT AL KMNMNLR E L YL \ ADNKLNGLQ 
DSAQLGNLLKFNCSLQILDLRNNHVLDSGLAYICEGLKEQRKGL 

VTlA VTiWNNfiTiTWTf2MfiT«*T.f*2MTT.DU r PncT ptt KTTy™ , irkTnTr«MT9/-iir 
v x u \ v unnnuui n j. orJrtr Ltori J. Xjrn.±\j£)±jtL 1 IjN liGHJNJ F I. GNEG V 

RHLKNGLISNRSVLRLGLASTKLTCEGAVAVAEFIAESPRLLRL 
DLRENE I KTGGLMALS LALKVNHS LLRLDLDRE PKKEAVKS FIE 
TQKALLAEIQNGCKRNLVLAREREEKEQPPQLSASMPETTATEP 
QPDDEPAAGVQNGAPSPAPSPDSDSDSDSDGEEEEEEEGERDET 
PSGAIDTRDTGSSEPQPPPEPPRSGPPLPNGLKPEFALALPPEP 
PPGPEVKGGSCGLEHELSCSKNEKELEELLLEASQESGQETL 


6035 


19 


404 


SVTYLGI I LHKNTGALPADPVQLI SQTPTPSTKQQLLS FLGMVG 
YFYLWIPGFAILTKPLCKLTKENLADAIDPKSFSHSSFRSLKTA
LENASTLALPDSSQPF\SLHTAEVQGCWEILTQGLGPLPV


6036 


1745 


356 


LPDVEFCLGRRRGRKMDSVEKGAATSVSNPRGRPSRGRPPKLQRN 
a vtt\Ks v IS ft. f e MLiAAli I Ii/\RGG S KG I P LKNIKHLAGVPLIGW 
VLRAALDSGAFQSVWVSTDHDEIENVAKQFGAQVHRRSSEVSKD 
S S TS LD A HE FLN YHNE VD I VGN I Q AT S P CLHPTDLQ KVAEM I R 
EEG YDS VFS WRRHQ FR WS E I QKG VREVTE PLNLNPAKR PRRQD 
WDGELYENGSFYFAKRHLIEMGYLQGGKMAYYEMRAEHSVDIDV 
DIDWPIAEQRVLRYGYFGKEKLKEIKLLVCNIDGCLTNGHIYVS
GDQKEIISYDVKDAIGISLLKKSGIEVRLISERACSKQTLSSLK 
LDCKMEVSVSDKLA\A/DEWRKEMGLCWKEVAYLGNEVSDEECLK 

rv v \j±j&\?n.irf\±JJ\\,0 InyAftVUI X l«Il^iVGG{\G/\ \ ± KCjC AdH 1 C 

MEKGLINFMPKNRNLAVNIGEKK 


6037 


2936 


1919 


WTSWWMSSVLTILLFSLQGNKMLNYSAPSAGGYLLPRKPVGTPA 
GGGFPRRHSVTLPSSKFRQNQLLSSLKGEPAPALSSRDSRFRDR 
S FS EGG E RLL P TQ KQPGGGQ VNS S R Y KT \ E L CR P FEENGAC K YG 
DKCQ FAHG I HEL RS LTRH P KY KTE LCRT FHT IGFCPYGPRCHFI 
HNAEERRALAGARDLSADRPRLQHSFSFAGFPSAAATAAATGLL 
DSPTSITPPPILSADDLLGSPTLPDGTNNPF\AFSSQELASLFA 

PSMflTiPOflR^PTTPT.PDDMQPCDUMCnCDDOnrvneT ctrvM?/nvT «■» 
njuirvjvjcrox' x i r Ljc Kt'iMoCiO irrlf^r roryiJoijiaDQEGXXjS 

SSSSSHSGSDSPTLDNSRRLPIFSRLSISDD 


6038


1450 


426 


SSALQEFGTRNHTFGVPLPHRRKQIISCNICQLRFNSDSQAAAH 
YKGTKHAKKLKALEAMKNKQKSVTAKDSAKTTFTSITTNTINTS 
S DKTDGTAGTPA I S TTTTVE IRKS S VMTTE I TS K VEKSPTTATG 
NSSCPSTETEEEKAKRLL\YCSLCKVAVNSASQLEAHNSGTKHK 
TMLEARNGSGTI KAFPRAGVKGKGPVNKGNTGLQNKTFHCE ICD 
VHVNSETQLKQHISSRRHKDRAAGKPPKPKYSPYNKLQKTAHPL
GVKLVFS KEPS KPLAPRI LPNPLAAAAAAAAVAVS S PFSLRTAP 
AATLFQTSALPPALLRPAPGPIRTAHTPVLFAPY 


6039 


4073 


1000 


LDEYEARLTLANLDDFEEDNEDDDENRVNQEEKAAKITELINKL 
NFLDEAEKDLATVNSNPFDDPDAAELNPFGDPDSEEP ITETAS P 
RKTEDSFYNNSYNPFKEVQTPQYLNPFDEPEAFVTIKDSPPQST 
KKKNIRPVDMSKYLYADSSKTEEEELDESNPFYEPKSTPPPNNL 
VNPVQELETERRVKRKAPAPPVLSPKTGVLNENTVSAGKDLSTS
PKPSPIPSPVLGRKPNASQSLLVWCKEVTKNYRGVKITNFTTSW 
RNGLSFCAILHHFRPDLIDYKSLNPQDIKENNKKAYDGFASIGI 
SRLLEPSPMVLLAIPDKLTVMTYLYQIRAHFSGQELNWQIEEW 
SSKSTYKVGNYETDTNSSVDQEKFYAELSDLXREPELQQPISGA 
VDFLSQDDSVFVNDSGVGESESEHQTPDDHLSPSTASPYCRRTK 
S DTE PQKSQQS SGRTSGSDDPG I CSNTDSTQAQVLLGKKRLLKA 
ETLELSDLYVSDKKKDMSPPFICEETDEQKLQTLDIGSNLEKEK 
LENSRSLECRSDPESPIKKTSLSPTSKLGYSYSRDLDLAKKKHA 
SLRQTESDPDADRTTLNHADHSSKIVQHRLLSRQEELKERARVL 
LEQARRDAALKAGNKHNTNTAAPFCNRQLSDQQDEERRRQLRER 
ARQL I AEARS GG KMS ELPS YGERAAE KLKERS KASGDENDNI E I 
DTNEEIPEGFWGGGDELTNLENDLDTPEQNSKLVDLKLKKLLE 
VQPQVANSPSSAAQKAVTESSEQDMKSGTEDLRTERLQKTTERF 
RNPWFSKDSTVRKTQLQSFSQYIENRPEMKRQRSIQEDTKKGN 
EEKAA I TETQRKPSEDE VLNKG FKDS \ SQ YWGELAALENEQKQ 
IDTRAALVEKRLRYLMDTGRNTEEEEAMKQEWFMLVNKKNALIR 
RMNQLSLLEKEHDLERRYELLNRELRAMLAIEDWQKTEAQKRRE 

QLLLDELVALVNKRDALVRDLDAQEKQAEEEDEHLERTLEQNKG 
KMAKKEEKCVLQ 


6040 


475 


1052 


PTALMTAPSCAFPVQFRQPSVSGLSQITKSLYISNGVAANNKLM
LSSNQITMVINVSVEWNTLYEDIQYMQVPVADSPNSRLCDFFD 
P I ADH I HS VEM KQGR \ TLLH CAAG VS RS AALCLA YLMKYHAMSL 
LD AH TWTKSCRPIIRPNSGFWEQLI H YE FQL FG KNT VH M VS S P V 
GM I P D I YEKE VRLM I PL 


6041 


2 


3686 


TEKDEKTAHNLENVLIHFWERLSEICVAKISEPEADVESVLGVS 

NLLQVLQKPKGSLKSSKKKNGKVRPADEILESNKENEKCVSSEG 

EKIECWELTTEPSLTHNSSGLLSPLRKKPLEDLVCKLADISINY 

VNERKSEQHLRFLSTLLDSFSSSRVFKMLLGDEKQSIVQAKPLE 

IAKLVQKNPAVQFLYQKLIGWLNEDQRKDFGFLVDILYSALRCC 

DNDMERKKVLDDLTKVDLKWNSLLKIIEKACPSSDKHALVTPWL 

KGDILGEKLVNLADCLCNEDLESRVSSESHFSERWTLLSLVLSQ 

HVKNDYLIGDVYVERIIVRLHETLFKTKKLSEAESSDSSVSFIC 

DVAYNYFSSAKGCLLMPSSEDLLLTLFQLCAQSKEKTHLPDFLI 

CKLKNTWLSGVNLLVHQTDSSYKESTFLHLSALWLKNQVQASSL 

DINSLQVLLSAVDDLLNTLLESEDSYLMGVYIGSVMPNDSEWEK 

MRQSLPMQWLHRPLLEGRLSLNYECFKTDFKEQDIKTLPSHLCT 

SALLSKMVLIALRKETVLENNELEKIIAELLYSLQWCEELDNPP 

IFLIGFCEI LQKMN I T YDNLR VLGNMSGLLQ LLFNRS REHGTLW 

SLIIAKLILSRSISSDEVKPHYKRKESFFPLTEGNLHTIQSLCP 

FLS KEEKKEFSAQCI PALLGWTKKDLCSTNGGFGHLAI FNSCLQ 

TKS IDDGELLHGILKI I ISWKKEHEDIFLFSCNLSEASPEVLGV 

NIEIIRFLSLFLKYCSSPLAESEWDFIMCSMLAWLETTSENQAL 

YSIPLVQLFACVSCDLACDLSAFFDSTTLDTIGNLPVNLISEWK 

EFFSQGIHSLLLPI LVT VTGEN KD VS ET S FQNAMLK PM CE TL T Y 

ISKEQLLSHKLPARLVADQKTNLPEYLQTLLNTLAPLLLFRARP 

VQIAVYHMLYKLMPELPQYDQDNLKSYGDEEEEPALSPPAALMS 

LLSIQEDLLENVLGCIPVGOIVTIKPLSEDFCYVLGYLLTWKLI

LTFFKAASSQLRALYSMYLRKTKSLNKLLYHLFRLMPENPTYAE 

TAVEVPNKDPKTFFTEELQLS IRETTMLPYHI PHLACSVYHMTL 

KDLPAMVRLWWNSSEKRVFNIVDRFTSKYVSSVLSFQEISSVQT 

STQLFNGMTVKARATTREVMATYTIEDIVIELIIQLPSNYPLGS 

I I VESGKRVGVAVQQWRNWMLQLSTYLTHQNGS IMEGLALWKNN 

VDKRFEGVEDCM I CFSVIHGFNYSLPKKACRTCKKKFHSA\ CLY 

KWFTSSNKSTCSLCRETFF 


6042 


1306 


253 


MAEIAPASPSDIKASVSNGDTTLLCSRRQSCGMWEVRQVSLTYP

GSPAPSHSLPLQPRSGGSLCPSRAW/PDPHQLFDDTSSAQSRGY 

GAQRAPGGLSYPAASPTPHAAFLADPVSNMAMAYGSSLAAQGKE 

LVDKNIDRFIPITKLKYYFAVDTMYVGRKLGLLFFPYLHQDWEV 

QYQQDTPVAPRFDVNAPDLYIPAMAFITYVLVAGLALGTQDRFS 

PDLLGLQASSAIAWLTLEVLAILLSLYLVTVNTDLTTIDLVAFL 

G YK YVG M I GG VLMGLLFGK I G Y YL VLG WCCVA I FVFM I RTLRLK 
I LADAAAEGVPVRGARNQLRMYLTMAVAAAQPMLM YWLTFHLVR 


6043 


403 


599 


LCLFFPFPCATPVLPLPSLISAL/CLSHLSVSSWFCPCQPPLPC
PLPPLQNKTAKGSLSTEQSERG 


6044 


793 


412 


KLEMWNFTLISKVKIS RE VTM IAS KFG I GQQ VRH S LLG Y LG VW 
DIDPVYSLSEPSPDELAVNDELRAAPWYHWMEDDNGLPVHTYL 
AEAQLSSELQDEHP\EQPSMDELAQTIRKQLiQAPRLRN 


6045 


155 


2299 


SPLPQVAATWYLRRRLSDSNFMANLPNGYMTDLQRPQPPPPPPG
AHSPGATPGPGTATAERSSGVAPAASPAAPSPGSSGGGGFFSSL 
SNAVKQTTAAAAATFSEQVGGGSGGAGRGGAASRVLLVIDEPHT 
DWAKYFKGKKIHGEIDIKVEQAEFSDLNLVAHANGGFSVDMEVL
RNGVKWRSLKPDFVLIRQHAFSMARNGDYRSLVIGLQYAGIPS
VNSLHSVYNFCDKPWVFAQMVRLHKKLGTEEFPLIDQTFYPNHK 
EMLSS\TTYPVWKMGHGTLWGWGKVKVDNQHDFQDIASWALT 
KTYATAEPFIDAKYDVRVQKIGQNYKAYMRTSVSGNWKTNTGSA 
MLEQIAMSDRYKLWVDTCSEIFGGLDICAVEALHGKDGRDHIIE 
WGSSMPLIGDHQDEDKQLIVELWNKMAQALPRQRQRDASPGR 
GSHGQTPSPGALPLGRQTSQQPAGPPAQQRPPPQGGPPQPGPGP 
QRQGPPLQQRPPPQGQQHLSGLGPPAGSPLPQRLPSPTSAPQQP 
ASQAAPPTQGQGRQSRPVAGGPGAPPAARPPASPSPQRQAGPPQ 
ATRQTSVSGPAPPKASGAPPGGQQRQGPPQKPPGPAGPTRQASQ 
AGPVPRTGPPTTQQPRPSGPGPAGRPKPQLAQKPSQDVPPPATA 
AAGGPPHPQLNKSQSLTNAFNLPEPAPPRPSLSQDEVKAETIRS 
LRKSFASLFSD 


6046 


212 


1075 


EGLTGPCERVPFLLGRGPPHGATRAGHRRAVRWAGPESLPPLPR 
SLIMDSPRAGTHQGPLDAETEVGADRCTSTAYQEQRPQVEQVGK 
QAPLS PGLP AMGGPG PG PCEDPAGAGGAGAGGSE PLVTVT VQCA 
FTVALRARRGADLSSLRALLGQALPHQ\AQLGQLSYLAPGEDGH 
WVP I PEEESLQRAWQDAAACPRGLQLQCRGAGGRPVLYQWAQH 
SYSAQGPEDLGFRQGDTVDVLCEVDQAWLEGHCDGRIGIFPKCF 
WPAGPRMSGAPGRLPRSQQGDQP 


6047 


49 


1405 


PVLVTSLRMREADTLRPPQLMEVSADIISTVEFNHTGELLATGD 
KGGRWIFQREPESKNAPHSQGEYDVYSTFQSHEPEFDYLKSLE 
IEEKINKIKWLPQQNAAHSLLSTNDKTIKLWKITERDKRPEGYN 
LKDEEGKLKDLSTVTSLQVP VLKPMDLMVEVS PRR I FANGHTYH 
INSISVNSDCETYMSADDLRINLWHLAITDRSFTP\NIVDIKPA 
NMEDLTEVITASEFHPHHCNLFVYSSSKGSLRLCDMRAAALCDK 
HSKLFEEPEDPSNRSFFSEIIS\SVSDVKFSHSDRYMLTR\DYL 
TVKVWDL\NMEAR P I ETYQVHDYLRS KLCSL YENDCI FDKFECA 
WNG S DS V I MTGA \ YNN FFRM FDRNTKRD VTL \ E AS RES S KPRAV 

LKPRRVCVGGKRRRDDISVDSLDFTKKILHTAWHPAENIIAIAA 
TNNLYI FQDKVNSDMH 


6048 


1 


3194 


GIRTPKFCDSPTSDLEMRNGRGRGKRMRPNSNTPVNETATASDS 
KGTSNSS KTRAGANS KGRRGS QNSS EHRPPASSTS EDVKAS PS S 
ANKRKNKPLSDMELNSSSEDSKGSKRVRTNSMGSATGPLPGTKV 
EPTVLDRNCPSPVLIDCPHPNCNKKYKHINGLKYHQAHAHTDDD 
SKPEADGDSEYGEEPILHADLGSCNG\ASVSQK\GSLSPARSAT 
PKVRLVEPHSPSPSSKFSTKGLCKKKLSGEGDTDLGALSNDGSD 
DGPSVMDETSNDAFDSLERKCMEKEKCKKPSSLKPEKIPSKSLK 

SARPI/APLAIPPQQIYTFQTATFTAASPGSSSGLTATVAQAMP

NSPQLKPIQPKPTVMGEPFTVNPALTPAKDKKKKDKKKKESSKE 
LE S PLTPG KVCRAE EG KS P FRE S SGNGMKM EGLLNGS S D PHQS R 
LASIKAEADKIYSFTDNAPSPSIGGSSRLENTTPTQPLTPLHW 
TQNGAEASSVKTNSPAYSDISDAGEDGEGKVDSVKSKDAEQLVK 
EGAKKTLFPPQPQSKDSPYYQGFESYYSPSYAQSSPGALNPSSQ 
AGVESQALKTKRDEEPESIEGKVKNDICEEKKPELSSSSQQPSV 
IQQRPNMYMQSLYYNQYAYVPPYGYSDQSYHTHLLSTNTAYRQQ 
YEEQQKRQSLEQQQRGVDKKAEMGLKEREAALKEEWKQKPSIPP 
TLTKAPSLTDLVKSGPGKAKEPGADPAKSVI I PKLDDS SKLPGQ 
APEGLKVKLSDASHLSKEASEAKTGAECGRQAEMDPILWYRQEA 
E PRMWT YVY PAKYSD I KSEDERWKEERDRKLKEERS RS KDS VPK 
EDGKESTSSDCKLPTSEESRLGSKEPRPSVHVPVSSPLTQHQSY 

IPYMHGYSYSQSYDPNHPSYRSMPAVMMQNYPGSYLPSSYSFSP 

YGSKVSGGEDADKARASPSVTCKSSSESKALDILQQHASHYKSK 

SPTISDKTSQERDRGGCGWGGGGSCSSVGGASGGERSVDRPRT 

SPSQRLMSTHHHHHHLGYSLLPAQYNLPYAAGLSSTAIVASQQG 
STPSLYPPPRR 


6049 


215 


1089 


AMTGVFDRRVPSIRSGDFQAPFQTSAAMHHPSQESPTLPESSAT
DSDYYSPTGGAPHGYCSPTSASYG\KALNPYQYQYHGVNGSAGS 
YPAKAYADYSYASSYHQYGGAYNRVPSATNQPEKEVTEPEVRMV 
NGKP KKVR KP RT I YS S FQLAALQRR FQ KTQ YLAL P ERAELAAS L 
GLTQTQVKIWFQNKXSKIKKIMKNGEMPPEHSPSSSDPMACNSP 
QSPAVWEPQGSSRSLSHHPHAHPPTSNQSPASSYLENSASWYTS 
AASSINSHLPPPGSLQHPLALASGTLY 


6050


566


1718 


KGLERTCCAMEESDSEKTTEKENLGPRMDPPLGEPG\GSLGWVL
PNTAMKKKVLLMGKSGSGKTSMRSIIFANYIARDTRRLGATILD
RIHSLQINSSLSTYSLVDSVGNTKTFDVEHSHVRFLGNLVLNLW
DCGGQDTFMENYFTSQRDNIFRNVEVLIYVFDVESRELEKDMHY 
YQS CLE AI LQNS PDAK I FCLVH KMDLVQE DQRDL I FKEREEDLR 
RLSRPLECSCFRTSIWDETLYKAWSSIVYQLIPNVQQLEMNLRN 
FAE I I EADE VLLFERATFLVI S HYQCKEQRDAHRFEKI SNIIKQ 
FKLSCSKLAASFQSMEVRNSNFAAFIDIFTSNTYVMWMSDPSI 
PSAATLINIRNARKHFEKLERVDGPKQCLLMR


6051


566


1718 


KGLERTCCAMEESDSEKTTEKENLGPRMDPPLGEPG\GSLGWVL 
PNTAMKKKVLLMGKSGSGKTSMRSIIFANYIARDTRRLGATILD 
RIHSLQINSSLSTYSLVDSVGNTKTFDVEHSHVRFLGNLVLNLW 
DCGGQDTFMENYFTSQRDNIFRNVEVLIYVFDVESRELEKDMHY 
YQSCLEAILQNSPDAKI FCLVH KMDLVQEDQRDLI FKEREEDLR 
RLSRPLECSCFRTSIWDETLYKAWSSIVYQLIPNVQQLEMNLRN 
FAE I I EADE VLLFERATFLVISH YQCKEQRDAHRFEKI SNIIKQ 
FKLSCSKLAASFQSMEVRNSNFAAFIDIFTSNTYVMWMSDPSI 
PSAATLINIRNARKHFEKLERVDGPKQCLLMR


6052


566


1718 


KGLERTCCAMEESDSEKTTEKENLGPRMDPPLGEPG\GSLGWVL
PNTAMKKKVLLMGKSGSGKTSMRSIIFANYIARDTRRLGATILD
RIHSLQINSSLSTYSLVDSVGNTKTFDVEHSHVRFLGNLVLNLW 
DCGGQDTFMENYFTSQRDNIFRNVEVLIYVPDVESRELEKDMHY 
YQS CLEAI LQNS PDAK I FCLVHKMDLVQEDQRDL I FKEREEDLR 
RLSRPLECSCFRTSIWDETLYKAWSSIVYQLIPNVQQLEMNLRN 
FAE I I EADEVLLFERATFLVISH YQCKEQRDAHRFEKI SNI I KQ 
FKLSCSKLAASFQSMEVRNSNFAAFIDI FTSNTYVMVVMSDPS I 
PSAATLINIRNARKHFEKLERVDGPKQCLLMR 


6053 


201 


1704 


KGTEMNKSRWQSRRRHGRRSHQQNPWFRLRDSEDRSDSRAAQPA

HDSGHGDDESPSTSSGTAGTSSVPELPGFYFDPEKKRYFRLLPG 

HNNCNPLTKESIRQKEMESKRLRLLQEEDRRKKIARMGFNASSM 

LRKSQLGFLNVTNYCHLAHELRLSCMERKKVQIRSMDPSALASD 

R FNL I LAD TNS DRLFT VND VTVGGS K YG 1 1 NLQS LKTPTLKV FM 

HENLYFTNRKV\NSVCWASLNHLDSHILLCLMGLAETPGCATLL 

PASLFVNSHPAGIDRPG\MLCSFRIPGAWSCAWSLNIQANNCFS 

TGLSRR VLLTNWTGHRQS FGTNSDVLAQQFALMAPLLFNGCRS 

GEIFAIDLRCGNQGKGWKATRLFHDSAVTSVRILQDEQYLMASD 

MAGKI KLWDLRTTKCVRQYEGHVNEYAYLPLHVHEEEG ILVAVG 

QDCYTRIWSLHDARLLRTIPSPYPASKADIPSVAFSSRLGGSRG 


6054 


1 


1054 


P P I ARLQE FGTSRRHMAAPSG VHLLVRRGSHRI FS S t>LlNHi YLH 
KQSSSQQRRNFFFRRQRDISHSIVLPAAVSSAHPVPKHIKKPDY 
VTTGIVPDWGDSIEVKNEDQIQGLHQACQLARHVLLLAGKSLKV 
DMTTEEIDALVHREIISHNAYPSPLGYGGFPKSVCTSVNNVLCH
GIPDSRPLQDGDIINIDVTVYYNGYHGDTSETFLVGNVDECGKK 
LVEVARRCRDEAIAACRAGAPFSVIGNTISHITHQNGFQVCPHF 
VGHGIGS YFHGHPE I WHHANDS DLPMEEGMAFT IEPII TEGS PE 
FKVLEDAWTWSLD/TSKVSAQFEHTVLITSRGAQILTKLPHEA


6055 


421 


2364 


PPYFLLSFLAWWLYGQSDRTETDISQSAGPPPGTLQCSALHHDP
GCANCSRFCRDCSPPACQCHTHVFPGNALNGVQPPELSRTLALI 
SSREPPRKKKKSQTBTGKERERTSFLTQGGKRFELQHGLAGICM 
TLLITGDS I VSAEAVWDHVTMANRELAFKA6DVIKVLDASNKDW 

WWGQIDDBEGWFPASFVRLWVNHEDEVEEGPSDVQNGHLDPNSD 
CLCLGRPL0NRD0MRANVINFTM t 5TFRWYTK'H'T VTiT nWflVT .vnr* 

RKRRDMFSDEQLKVIFGNIEDIYRFQMGFVRDLEKQYNNDDPHL 
SEIGPCFLEHQDGFWIYSEYCNNHLDACMELSKLMKDSRYQHFF 
E ACR LLQQM I D I A\ I DG FLLTPVQK I CKYPLQLAELLK YTAQDH 
SDYRYVAAALAVMRNVTQQINERKRRLENIDKIAQWQASVLDWE 
GEDI1jDRSSELIYTGEMAWIYQP\YGRNQQRVFFLFDHQMVLCK 
KDLIRRDILYYKGRIDMDKYEWDIEDGRDDDFNVSMKNAFKLH 
NKETEEIHLFFAKKLEEKIRWLRAFREERKMVQEDEKIGFEISE 
NQKRQAAMTVRKVPKQKGVNSARSVPPSYPPPQDPLNHGQYLVP 
\DGIAQSQVFEFTEPKRSQSPFWQNFSRLTPFKK 


6056


43 


3356 


SGGRGPVRVRSEQLSPSAEQVSQISQISLGRRPLSSLPPPPSRA 
LAPTRAPDTALTIMEVAEVESPLNPSCKIMTFRPSMEEFREFNK 
YL A YM E S KG AHRAG LAK V IPPKEWKPRQCYDDI DNLL I P AP I QQ 
MVTGQSGLFTQYNIQKKAMTVKEFRQLANSGKYCTPRYLDYEDL 
ERKYWKNLTFVAPIYGADINGSIYDEGVDEWNIARLNTVLDWE 
EECGISI EGVNTP YLYFGMWKTTFAWHTEDMDLYS INYLHFGEP 
KSWYAIPPEHGKRLERLAQGFFPSSSQGCDAFLRHKMTLISPSV 
LKKYGIPFDKITQEAGEFMITFPYGYHAGFNHGFNCAESTNFAT 
VRW I D YGKVAKLCTCRKDMVKISMD I FVRKFQPDR YQLWKQGKD 
IYTIDHTKPTPASTPEVKAWLQRRRKVRKASRSFQCARSTSKRP 
KADEEEEVSDEVDGAEVPNPDSVTDDLKVSEKSEAAVKLRWTEA 
SS EE E S S ASRMQ VEQNLS DH I KLSGNSCLSTS VTEDI KTEDDKA 
YAYRSVPSISSEADDSIPLSTGYEKPEKSDPSELSWPKSPESCS 
SVAESNGVLTEGEESDVESHGNGLEPGEIPAVPSGERNSFKVPS 
IAEGENKTSKSWRHPLSRPPARSPMTLVKQQAPSDEELPEVLSI 
EEEVEETESWAKPLIHLWQTKPPNFAAEQEYNATVARMKPHCAI 
1 -LiJjM v i w k.f L/fa fa w b aNUAK W E T KLDE WTS EG KTKPL I P EMC F 
I YSEEN I E YS P PNAFLEEDGTS LLI S CAKCCVR VHAS CYG I PSH 
EICDGWLCARCKRNAWTAECCLCNLRGGALKQTKNNKWAHVMCA 
VAVPEVRFTNVPERTQIDVGR I PLQRLKLKCI FCRHRVKRVSGA 
CIQCSYGRCPASFHVTCAH7^AGVL\MEPDDWPYWNITCFRHKV 
NPNVKS KACEKVI S VGQTVI TKHRNTRYYSCRVMAVTSQTFYEV 
MFDDGS FSRDTFPEDI VSRDCLKLGPPAEGEWQVKWPDGKLYG 
AKYFGSNIAHMYOVEFEDGSOIAMKRFDTYTT.nPPT.DWDi/vataT? 

VS AGRCHLGTCQVNS LS S PHVS QAQQET YLGFW I NS KKSQCN I F 
LSGTY 


6057 


1 


853 


FVARLKEQEGEGGLGPRKEKGRARGRERRRKMQLTRCCFVFLVQ 
GS LYLVICGQDDGPPGS ED PERDDHEGQPRPRVPRKRGHISPKS 
R PMANS TL LGLLA P PG EAWG I LGQ P PNR PNHS P P P SAKVKK I FG 
WGDFYSNIKTVALNLLVTGKIVDHGNGTFSVHFQHNATGQGNIS 
I SLVPPS KAVEFHQEQQ IF I EAKASKI FNC\ RMEWEKVE\RGRR 

LVQKVCPDYttYHSDTPYYPSG 


6058 


1 


986 


HPLPSASLGLPSVSIiGVSLCVRSALLEAWPMLPKRRRARVGSP

SGDAASSTPPSTRFPGVAIYLVEPRMGRSRRAFLTGLARSKGFR 

VLDACSSEATHWMEETSAEEAVSWQERRMAAAPPGCTPPALLD 

I SWLTES LGAGQPVP VE CRHRLE VAGPS KGPLSPAWMPAYACQR 

PTPLTHHNTGLSEALEILAEAAGFEGSEGRLLTFCRAASVLKAL 

PSPVTTLSQLQGLPHFGEHSSRWQELLEHGVCEEVERVRRSE/ 

RLFTQIFGVGVKTADRWYREGLRTLDDLREQPQKLTQQQKAGEP 

SREAGPWASLNCTLDPSASTP 


6059 


2 


3650 


QQDFESLADLTDHRAHRCPGDGDDDPQLSWVASSPSSKDVASPT 
QMIGDGCDLGLGEEEGGTGLPYPCQFCDKSFIRLSYLKRHEQIH 
SDKLPFKCTYCSRLFKHKRSRDRHIKLHTGDKKYHCHECEAAFS 
RSDHLKIHLKTHSSSKPFKCTVCKRGFSSTSSLQSHMQAHKKNK 
EHLAKSEKEAKKDDFMCDYCEDTFSQTEELEKHVLTRHPQLSEK 
ADLQCIHCPEVFVDENTLLAHIHQAHANQKHKCPMCPE\QFSSV 
\EGVYCHLDSHRQPDSSNHSVSPDPVLGSVASMSSATPDSSASV 
ERGSTPDSTLKPLRGQKKMRDDGQGWTKWYSCPYCSKRDFNSL 
AVLEIHLKTIHADKPQQSHTCQICLDSMPTLYNLNEHVRKLHKN 
HAYPVMQFGNISAFHCNYCPEMFADINSLQEHIRVSHCGPNANP 
SDGNNAFFCNQCSMGFLTESSLTEHIQ\Q\AHCSVGSAKLESPV 
VQPTQSFMEVYSCPYCTNSPI FGS ILKLTKHIKENHKNIPLAHS 
KKS KAEQS P VS S DVE VS S PKRQRLSAS ANS I SNGEYPCNQCDLK 
FSNFESFQTHLKLHLELLLRKQACPQCKEDFDSQESLLQHLTVH 
YMTTSTHYVCESCDKQFSSVDD\LQKH\LLDMPHPLCCTHCT\L 
CQEVFDS \ KVS I \QVHLAVKHSNE KKMYRCTACNWDFRKE ADLQ 
VHVKHSHLGNPAKAHKCIFCX3ETFSTEVELQCHITTHSKKYNCK 
FCS KAFHAI I LLEKHLREKHCVFDAATENGTANGVPPMATKKAE 
PADLQGMLL KNPE APNS H EAS E DD VD AS P PMYRmT ccs A a v tm T7 

VLLQNHRLRDHNIRPGEDDGSRKKAEFIKGSHKCNVCSRTFFSE 
NGLREHLQTHRGPAKHYMCPICGERFPSLLTLTEHKVTHSKSLD 
TGTCRICKMPLQSEEEFIEHCQMHPDLRNSLTGFRCWCMQTVT 
STLELKI HGTFHMQKLAGS SAAS S PNGQGLQKLYKCALCLKEFR 
SKQDLVKLDVNGLPYGLCAGCMARSANGQVGGLAPPEPADRPCA 
GLRCPECSVKFESAEDLESHMQVDHRDLTPETSGPRKGTQTSPV 
PRKKTYQCI KCQMTFENE RE I Q IHVANHM I E EG I NHECKLCNQM 
FDS PAKLLCHL I EHS FEGMGGTFKCP VCFTVFVQANKLQQHI FA 
VHGQEDKIYDCSQCPQKFFFQTELQNHTMSQHAQ 


6060 


2145 


202 


SYEIVGKNKLEVNHSQLKALCKCSLPSRLLPLGENLPLLDRGFR 
KEPRSRGSRERDNMLHLHHSCLCFRSWLPAMLAVLLSLAPSASS 
D I S AS R PNI LLLMADDLG I GD I G C YGNNTMR TPN I DRLAEDG VK 
LTQHISAASLCTPSRAAFLTGRYPVRSGMVSSIGYRVLQWTGAS 
GGLPTNETTFAKILEEKGYATGLIGKWHLGLNCESASDHCHHPL 
HHGFDHFYGMPFSLMGDCARWRT, ( ?F'K'T?VMT.PnTfT,'KlT7T.irn\7T m \r 

ALTLVAGKLTHLIPVSWMPVIWSALSAVLLLASSYFVGALIVHA 
DCFLMRNHTITEQPMCFQRTTPLILQEVASFLKRNKHGPFLLFV 
S FLHVH IPLITMENFLGKS LHG L YGDNVKEMDWMVGR I LDTLDV 
EGLSNSTLIYFTSDHGGSLENQLGNTQYGGWNGIYKGGKGMGGW 
EGGIRVPGIFRWPGVLPAGRVIGEPTSLMDVFPTWRLAGSEVP 
QDRV I DG QDLL P LL LGTAQHS DHE FLMH YCER FLHAARWHQRDR 
GTMW KVHF VTP V FQ P EGAG AC YGR KVC P C FGEKWHHDPP LLFD 
LSRDPSETHILTPASE P VF YQ VMER \ VQQA VWEHQR TLS P VP LQ 
LDRLGNIWRPWLQPCCGPFPLCWCLREDDPQ 


6061 


110 


1330 


MNIHMKRKTIKNINTFENRMLMLDGMPAVRVKTELLESEQGSPN 
VHNYPDMEAVPLLLNNVKGEP PEDSLS VDHFQTQTE PVDLS INK 
ARTSPTAVSSSPVSMTASASSPSSTSTSSSSSSRLASSPTVITS 
VSSASSSSTVLTPGPLVASASGVGGQQFLHIIHPVPPSSPMNLQ 
S N KLS H VHR I P VWQS VP WYTAVRS P GNVNNT I WP LLEDGRG 
HGKAQMDPRGLSPRQSKSDSDDDDLPNVTLDSVNETGSTALSIA 
RAVQE VH P S P VS R VRGNRMNNQKFPCS ISPFSIES TRRQRTVLN 
P P DS R KTA Y S TDCD F \ EGLQQ KL YTKS S S PGRVHRRTHTGE KP Y 
KCTWEGCTWKFARSDELTRHYRKHTGVKPFKCADCDRSFSRSDH 
LALHRRRHMLV 


6062 


71 


1079 


ETMAKNGPENCEDCHILNAEAFKSKKICKSLKICGLVFGILALT
LIVLFWGSKHFWPEVPKKAYDMEHTFYSNGEKKKIYMEIDPVTR
TEIFRSGNGTDETLEVHDFKNGYTGIYFVGLQKCFIKTQIKVIP
EFSEPEEEIDENEEITTTFFEQSVIWVPAEKPIENRDFLKNSKI
LEICDNVTMYW\INPTL\ISGTFAXQLHHNFAFIILVSELQDFE
EEGEDLHFPANEKKGIEQNEQMWPQVKVEKTRHARQASEEELP
INDYTENGIEFDPMLDERGYCCIYCRRGNRYCRRVCEPLLGYYP
YPYCYQGGRVICRVIMPCNWWVARMLGRV


6063 


71 


1079 


ETMAKNGPENCEDCHILNAEAFKSKKICKSLKICGLVFGILALT 
LIVLFWGSKHFWPEVPKKAYDMEHTFYSNGEKKKIYMEIDPVTR
TEIFRSGNGTDETLEVHDFKNGYTGIYFVGLQKCFIKTQIKVIP
EFSEPEEEIDENEEITTTFFEQSVIWVPAEKPIENRDFLKNSKI
LEICDNVTMYW\INPTL\ISGTFAKQLHHNFAFIILVSELQDFE
EEGEDLHFPANEKKGIEQNEQWWPQVKVEKTRHARQASEEELP 
INDYTENGIEFDPMLDERGYCCIYCRRGNRYCRRVCEPLLGYYP
YPYCYQGGRVICRVIMPCNWWVARMLGRV 


6064 


913 


311 


NLPQSLPRPTEHSPPYSLEKMTDLVAVWDVALSDGVHKIEFEHG 
TTSGKRWYVDGKEEIRKEWMPKLVGKETFYVGAAKTKATINID 
AISGFAYEYTLEINGKSLKKYMEDRSKTTNTWVLHMDGENFRIV 
LE KDAMD VW CNG KKL E TAGE FVDDGTETH FS I GTH \ AC Y I KAV \ 
SSG\KRKEGIIHTLIVDNREIPEIAS 


6065 


1153 


641 


MS VR VARVAW VRGLGAS YRRGASS FPVPP PGAQGVAELLRDATG 
AEEEAPWAATERRMPGQCSVLLFPGQGSQWGMGRGLLNYPRVR 
E L YAAARR VLG YDLLE LS LHG P QE T LDRT VHCQ PAI F VAS LAAV 
EKLHHLQPSVIENCVAAAGFSVGEFAALVFAGAMEFAEG 


6066 


68 


3470 


VKENMPATRKPMRYGHTEGHTEVCFDDSGSFIVTCGSDGDVRIW 
EDLDDDDPKFI NVGE KA Y S CAL KS G KLVTAVS NNTI Q VHT F PEG 
VPDGILTRFTTNANHWFNGDGTKIAAGSSD\FLVKIVDVMDSS 
QQKTFRGHDAP VLSLS FDPKDI FLASASCDGSVRVWQ I SDQTCA 
ISWPLLQKCNDVINAKSICRLAWQPKSGKLLAIPVEKSVKLYRR 
ES WSHQFDLS DNFISQTLNI VTWS PCGQ YLAAGS INGL I IVWNV 
ETKDCMERVKHEKGYAICGLAWHPTCGRISYTDAEGNLGLLENV 
CD P S G KTS S S KVS SR VE KD YND L FDGDDM S NAGD FLNDNAVE I P 
SFSKGIINDDEDDEDLMMASGRPRQRSHILEDDENSVDISMLKT 
GSSLLKEEEEDGQEGSIHNLPLVTSQRPFYDGPMPTPRQKPFQS 
GSTPLHLTHRFMVWNS IGI IRCYNDEQDNAI DVEFHDTS IHHAT 
HLSNTLNYTIADLSHEAILLACESTDELASKLHCLHFSSWDSSK 
EW 1 1 DLPQNEDI EAICLGQGWAAAATS ALLLRL FTIGGVQKEVF 
SLAGPWSMAGHGEQLFIVYHRGTGFDGDQCLGVQLLELGKKKK 
QILHGDPLPLTRKSYLAWIGFSAEGTPCYVDSEGIVRMLNRGLG 
NTWTPICNTREHCKGKSDHYWWGIHENPQQLRCIPCKGSRFPP 
TLPRPAVAILSFKLPYCQIATEKGQMEEQFWRSVIFHNHLDYLA 
PCNGYEYEESTKNQATKEQQELLMKMIiALSCKLEREFRCVELADL 
MTQNAVNLA I KYASRSRKL I LAQKLSELAVEKAAE LTATQVEEE 
EEEEDFRKKLNAGYSNTATEWSQPRFRWQVEEDAEDSGEADDEE 
KPEIHKPGQNSFSKSTNSSDVSAKSGAVTFSSQGRVNPFKVSAS 
S KE P AMS MNS AR S TN I LDNMG KS S KKS TALSRTTNNEKS P 1 1 K P 
LIPKPKPKQASAASYFQKRNSQTNKTEEVKEENLKNVLSETPAI 
C P PQNTENQR P KTG FQMWLE ENRSN I L S DNPDF S DEAD 1 1 KEGM 
IRFRVLSTEERKVWANKAKGETASEGTEAKKRKRWDESDETEN 
QEEKAKENLNLSKKQKPLDFSTNQKLSAFAFKQE 


6067


858 


321 


LPWQRLGVLLSRGKMAVTGWLESLRTAQKTALLQDGRRKVHYLF 

PDGKEMAEEYDEKTSELLVRKWRVKSALGAMGQWQLEVGDPAPL 

GAGNLGPELIKESNANPIFMRKDTKMSFQWRIRNLPYPKDVYSV 

SVDQKERCIIVRTTNKKYYKKFSIPDLDRHQLPLDDALLSFA\T 
PTAP 


6068


13 


1730 


GSKMADLANEEKPAIAPPVFVFQKDKGQKSPAEQKNLSDSGEEP 
RGEAEAPHHGTGHPESAGEHALEPPAPAGASASTPPPPAPEAQL 
PPFPRELAGRSAGGSSPEGGEDSDREDGNYCPPVKRERTSSLTQ 
FPPSQSEERSSGFRLKPPTLIHGQAPSAGLPSQKPKEQQRSVLR 
PAVLQAPQPKALSQTVPSSGTNGVSLPADCTGAVPAASPDTAAW 
RSPSEAADEVCALEEKEPQKNESSNASEEEACEKKDPATQQAFV 
FGQNLRDRVKLINESVDEADMENAGHPSADTPTATNYFLQYISS 
SLENSTNSADASSNKFVFGQNMSERVLSPPKLNEVSSDANRENA 
AAESGSESSSQEATPEKESLAESAAAYTKATARKCLLEKVEVIT 
G E EAE SNVLQM QCKL F V FD KTS QS WVE RGRGLLR LNDMAS TDDG 
TLQSRLSDAGPRGSLR\LILNTKLWAQMQIDKASEK\S1RITAM 
DNEDQGVKVFLISASSKDTGQVYAALHHRILALRSRVEQEQEAK 

MPAPEPGAAPSNEEDDSDDDDVLAPSGATAAGAGDEGDGQTTGS 
T 


6069 


583


27 


PTRPGQAGSSSAMAAQRLGKRVLSKLQSPSRARGPGGSPGGLQK

RHARVTVKYDRRELQRRLDVEKWIDGRLEELYRGMEADMPDEIN 

IDELLELESEEERSRKIQGLLKSCGKPVEDFIQELLAKLQGLHR 
Q\PGLRQPSPSP\DGQPSAPFQGPGARTASPLTLLALFPGPPER 
RPALLCVLSCI 


6070 


478 


858 


IRVTVDGEFLHYIFPLQFLDSPEW/RFTETHRGRHF^QVTLTAE 
TDCR YVS WRRKKL YLL FAQHRY I S RLFS VL I GSD I ADKL YALND 
R VY I G KR YH YD I RLPN F YQ M ST P E I RR S P LTQH FQNSRR YW 


6071 


2 


1654 


HEARTKGNMALARP\VRLFSLVTRLLLAPRRGLTVRSPDEPLPV 
VRIPVALQRQLEQRQSRRRNLPRPVLVRPGPLLVSARRPELNQP 
ARLTLGRWERAPLASQGWKSRRARRDHFS I ERAQQEAPAVRKLS 
ojmjs r /\uij^AWi\.t'KVJUnAl»Utii \AAPEVVQ\PTTVQSSTIPSLLR 
GRHWCAAETGSGKTLSYLLPLLQRLLG\HPSLDSLPIPAPRGL 
VLVPSRELAQQVRAVAQPLGRSLGLIiVRDLEGGHGMRRIRLQLS 
RQP S AD VLVAT PGALWKALKS RL I S LEQLS FLVLDEADTLLDES 
FLELVDYILEKSHIAEGPADLEDPFNPKAQLVLVGATFPEGVGQ 
LLNKVASPDAVTTITSSKLHCIMPHVKQTFLRLKGADKVAELVH 
I L KHRDRAE RTG P S GT VL VFCNS S S T VN WLG Y I LDDHK I QHLRL 
QGQM P ALMR VG I F Q S FQ KS S RD I LLCTD I AS RGLD S TG VE L WN 
i L/r rrl lAJUx 1HKAGRVGRVGSEVPGTVISFVTHPWDVSLVQKI 
EliAARRRRSLPGLASSVKEPLPQAT 


6072 


1 


742 


KMERTEMMPTINSQLEFKSKPFPLVSSSRWLVKRGELTAYVEDT 
VLFSRRTSKQQVYFFLFNDVLIITKKKSEESYNVNDYSLRDQLL 
VESCDNEELNSSPGKNSSTMLYSRQSSASHLFTLTVLSNHANEK 
VEMLLGAETQSERARWITALGHSSGKPPADRTSLTQVEIVRSFT 
AKQPDELSLQVADWLI\YQRVSDGWYEGER\LRDGERGWFPME 
CAKE I TCQAT I D KNVERMGRLLGLETNV 


6073 


620 


860 


fuKKtjijAKtfjjisKK.r'Oj/ S ILVHCAVGVSRSATLVLAYLMLYHrlLT 
L VEAI KKVKDHRG I I PNRG FLRQLLALDRRLRQGLEA 


6074 


168 


1110 


PG AR CMATE LQ C PD S M P CHNQQ VNS AS T P S PEQL R PGD L I L DHA 
GGNRAS RAKV I L LTG YAHS S L PAELDSGACGG S S LNS EGNS G S G 
DSSSYDAPAGNSFLEDCELSRQIGAQLKLLPMNDQIRELQTIIR 
DKTASRGDFMFSADRLIRLWEEGLNQLPYKECMVTTPTGYKYE 
\j v nx n v.. (a v to i pi Kou c, AMh y G LiRD CCR S I R IG K I L IQSDEET 
QRAKVYYAKFPPDIYRRKVLLMYPILQTG\NTVIEAVKVLIEHG 
VQPS VI I LLSLFS T PHGAKS 1 1 QEF PE I T I LTTE VH P VAPTHFG 
QKYFGTD 


6075 


320


1091 


PPTCQPQEVEHH\YGYVPILGNKTLPSRCHQCVIVSSSSHLLGT

KLGPEIERAECTIRMNDAPTTGYSADVGNKTTYRWAHSSVFRV 

LRRPQEFVNRTPETVFIFWGPPSKMQKPQGSLVRVIQRAGLVFP 

NMEAYAVSPGRMRQFDDLFRGETGKDREKSHSWLSTGWFTMVIA 

VELCDHVHVYGMVPPNYCSQRPRLQRMPYHYYEPKGPDECVTYI 

QNEHSRKGNHHR F I TE KRVFSS WAQLYG I TFSHPS WT 


6076 


1721 


107 


HPSPTEAPRVQHLTMDCTWRILFLVAAATGTHAQVQLVQSGAEV

KKPGASVKVSCKVSGYTLTELSMHWVRQAPGKGLEWMGAFDPED 
GET I YAQKFQGRVTMTEDTSTDTAYMELSSLRS EDTAVY YCATD 
HGDYAFDIWGQGTMVTVSSAPTKAPDVFPIISGCRHPKDNSPW 
LACLITGYHPTSV\TVTWYMGTQSQA\QRTFPEIQRRDSYYMTS 
SQLSTPLQQWRQGEYKCWQHTASKSKKEIFRWPESPKAQASSV 
PTAQPQAEGSLAKATTAPATTRNTGRGGEEKKKEKEKEEQEERE 
TKTPECPSHTQPLGVYLLTPAVQDLWLRDKATFTCFWGSDLKD 
AHLTWEVAGKVPTGGVEEGLLERHSNGSQSQHSRLTLPRSLWNA 
GTSVTCTLNHPSLPPQRLMALREPAAQAPVKLSLNLLASSDPPE 
A\ASWLLCE VSGFSPPMT r.TiMWT.pr>HnpVNT«?nPAPap dt\dk"d\ 
RS TT F WA \ WS VLR VP AP PS PQ P AT YTCW S HE DS RTLLNAS RS L 
EVSYVTDHGPMK 


6077 


3687 


1268 


LLPDMNLQPIFWIGLISSVCCVFAQTDENRCLKANAKSCGECIQ

AGPNCGWCTNSTFLQEGMPTSARCDDLEALKKKGCPPDDIENPR 

GSKDIKKNKNVTNRSKGTAEKLKPEDITQIQPQQLVLRLRSGEP 

QTFTLKFKRAEDYPIDLYYLM\DLSYSMKDDLENVKSLGTDLMN 

EMRRITSDFRIGFGSFVEKTVMPYISTTPAKLRNPCTSEQNCTS 

PFSYKNVLSLTNKGEVFNELVGKQRISGNLDSPEGGFDAIMQVA 

VCGSLIGWRNVTRLLVFSTDAGFHFAGDGKLGGIVIjPNDGQCHL 
ENNMYTMSHYYDYPSIAHLVQKLSENNIQTIFAVTEEFQPVYKE
LKNLIPKSAVGTLSANSSNVIQLIIDAYNSLSSEVILENGKLSE 
G VT I S YQS Y \ CKNG VNG TGENGRKCSNI S I GDEVQFE I S I TS NK 
CPKKDSDSFKIRPLGFTEEVEVI LQY I CECECQSEG I PESPKCH 
EGNGTFECGACRCNEGRVGRKCECSTDEVNSEDIGCFTARKENQ 
FQKSASNHGRVPSAGQCVCRKRDNTNEIYSGKFCECDNFNCDRS 
NG L I CGGNGVCKCRVCE CN PNYTGS ACD CS LDTSTCEAS NGQ I C 
NGRGICECGVCKCTDPKFQGQTCEMCQTCLGVCAEHKECVQCRA 
FNKGEKKDTCTQECSYFNITKVESRDKLPQPVQPDPVSHCKEKD 
VDDCWFYFTYSVNGNNEVMVHWENPECPTGPDIIPIVAGWAG 
I VL I G LALLL I W KLLM I I HDRR E F AKFE KE KMNAKWDTG ENP I Y 
KSAVTTWNPKYEGK 


6078 


1426 


180 


ETEDVMELLEEDLTCPICCSLFDDPRVLPCSHNFCKKCLEGILE 
GSVRNSLWRPVPFKCPTCRKKTFSYWELIPLQVNYSLKGIVEKY 
NKIKISPKMPVCKGH\LGQPLNIF\CL\TDMQLDL/CGIC\ATR 
GEHTKHVFCSIEDAYAQERDAFESLFQSFETWRRGDALSRLDTL 
ETSKRKSLQLLTKDSDKVKEFFEKLQHTLDQKKNEILSDFETMK 
LAVMQA YD P E I NKLNT I LQE QRMA FN I AE A F KD VS E P I V FLQQM 
QEFREKI KVI KETPLP PSNLPAS PLMKNFDTSQWEDI KLVDVDK 
LSLPQDTGTFISKIPWSFYKLFLLILLLGLVIVFGPTMFLEWSL 
FDDLATWKGCLSNFSSYLTKTADFIEQSVFYWEQVTDGFFIFNE 
RFKNFTLWLNNVAEFVCKYKLL 


6079 


1586 


141 


ATARDLGCARRIDRWMESTPSRGLNRVHLQCRNLQEFLGGLSP 
GVLDRLYGHPATCLAVFRELPSLAKNWVMRMLFLEQPLPQAAVA 
LWVKKEFSKAQEESTGLLSGLRIWHTQLLPGGLQGLILNPIFRQ 
NLRIAIjLGGGKAWSDDTSQLGPDKHARDVPSLDKYAEERWEWL 
HFMVGSPSAAVSQDLAQLLSQAGLMKSTEPGEPPCITSAGFQFL 
LLDTPAQLWYFMLQYLQTAQSRGMDLVEILSFLFQLSFSTLGKD 
YSVEGMSDSLLNFLQHLREFGLVFQRKRKSRRYYPT/RALAINL 
SSGVSGAGGTVHQPGFIV\VETNYRLYAYTESELQIALIALFSE 
MLYPFP\NMW\ARVTR\ESVQQAIASGITAQQIIHFLRTRAHP 
VMLKQTPVLPPTITDQIRLWELERDRLRFTEGVLYNQFLSQVDF 
ELL \ LAHAPKLG VLVFE /NTPAKRLMWTPAGHS DVKR F WKRQK 
HSS 


6080 


1 


1199 


IETIDHVGEFAMAAQAAGVSRQRAATQGLGSNQNALKYLGQDFK 
TLRQQCLDSGVLFKDPEFPACPSALGYKDLGPGSPQTQGIIWKR 
PTELCPSPQFIVGGATRTDICQGGLGDCWLLAAIASLTLNEELL 
YRWPRDQDFQENYAG I FHFQPLCPPS P\FWQYGE WVEWI DDR 
LPTKNGQLLFLHSEQGNEFWSALLEKAYAKLNGCYEALAGGSTV 
EGFEDFTGGISEFYDLKKPPANLYQIIRKALCAGSLLGCSIDVY 
SAAEAEAI TSQKLVKSHAYS VTGVEE VNFQGHPEKL IRLRNPWG 
EVEWSGAWSDDAPEWNHIDPRRKEELDKKVEDGEFWMSLSDFVR 
QFSRLE I CNLS PDSLSSEEVHKWNLVLFNGHWTRGSTAGGCQNY 
PGSS 


6081


3 


865 


EMLPLLLPLPLLWA/GALAQDARFRLEMPESVTVQEGLCIFVHC 
SVFYLEYGWKDSTPAYGHWFREGVSVDQETPVATNNSTQKVQKE 
TQGR FHLLGDPS RNNCSLS IRDARRRDNGS YFFW VARGRTKFS Y 
KYSPLSVYVTALTHRPDILIPEFLKSGHPSNLTCSVPWVCEQGT 
PPIFSWMSAAPTSLGPRTLHSSVLTI I PRPQDHGTNLI CQVTFP 
GAGVTTERTIQLSVSWKSGTVEEVWLAVGWAVKILLLCLCLI 
I LS FH KKKAVRAVE VS ENVYAVMG 


6082 


283 


jl <& a o 


b'AKS t\j fTUIKXAFUIjAAFGliAQPAALRLLLSR P PSAAMDGDGD 
PESVGQPEEASPEEQPEEASAEEERPEDQQEEEAAAAA\Y\LDE 
LPEPLLA/ LRVLAALPRHE \ LVQACR\ LVCLRWKELVDGAPLWL 
L KCQQEGL VP EGG VEE ERDH WQ QF Y FLS KRRRNLLRNP CGEEDL 
EGWCDVEHGGDGWRVEELPGDSGVEFTHDESVKKYFASSFEWCR 
KAQ VI DLQAEG YWEELLDTTQPAI WKDW YS GRS DAGCLYELTV 
KLLSEHENVLAEFSSGQVAVPQDSDGGGWMEISHTFTDYGPGVR 
FVRFEHGGQDSVYWKGWFGARVTNSSVWVEP 


6083 


1865 


309 


KQWCAERRGLGMSLADELLADLEEAAEEEEGGSYGEEEEEPAIE
DVQEETQLDLSGDSVKTIAKLWDSKMFAEIMMKIEEYISKQAKA
S E VMG P VE AAPE Y RV I VDANNLT VE I ENE LN T T H K P T P OK Y <? vp 

FPELESLVPNALDYIRTVKELGNSLDKCKNNENLQQILTNATIM 
WS VTAS TTQGQQLS E E ELERLEEACDHALE LNAS KHR I YE YVE 
SRMSFIAPNLSI I IGASTAAKIMGVAGGLTNLSKMPACNIMLLG 
AQRKTLSGFSSTSVLPHTGYIYHSDIVQSLPPIPPPFSVAP\DL 
RRKAARLVAAKCTLAARVDS FHESTEGKVG YELKDE I ERKFDKW 
QEPPPVKQVKPLPAPLDGQRKKRGGRRYRKMKERLGLTEIR\KQ
ANRMSFGEIEEDAYQEDLGFSLGHLGKSGSGRVRQTQVNEATKA 
RISKTLQRTLQKQSWYGGKSTIRDRSSGTASSVAFTPLQGLEI 
VNPQAAEKKVAEANQKYFSSMAEFLKVKGEKSGLMST 


6084


1865 


309 


KQWCAERRGLGMSLADELLADLEEAAEEEEGGSYGEEEEEPAIE 
DVQEETQLDLSGDSVKTIAKLWDSKMFAEIMMKIEEYISKQAKA
SEVMGPVEAAPEYRVIVDANNLTVETPNPT.WT TUVPTonvvcvD 

F P E LE S L VPNALDY I RT VKELGNS LD KG KNN ENLQQ I LTNAT I M 
WSVTASTTQGQQLSEEELERLEEACDMALELNASKHRIYEYVE 
S RMS FIAPNLS 1 1 IGASTAAKI MGVAGGLTNLS KMPACNI MLLG 
AQRKTLSGFSSTSVLPHTGYIYHSDIVQSLPPIPPPFSVAP\DL 
RRKAARLVAAKCTLAARVDS FHES TEGKVG YELKDE I ERKFDKW 
QEPPPVKQVKPLPAPLDGQRKKRGGRRYRKMKERLGLTEIR\KQ 
ANRMS FGE I EEDAYQEDLG FSLGHLGKSGSGRVRQTQVNEATKA 
RISKTLQRTLQKQSWYGGKSTIRDRSSGTASSVAFTPLQGLEI 
VN PQAAE KKVAE ANQ KY FS S MAE FLKVKG E KSG LMST 


6085 


2 


1456 


SGPRSFQGNRAVGRISLGGKRNPEVTLLPGVSSERVRRWRRARV 
GVARVKPGNPWKPSPATQVPR/VPAQVYLPGRGPPLREGEELVM 
1 v Lt x. nitfty 1 (j/ifCuo r U X vtiuti LuUNKTELPLTL YLCAGT 
QAESAQSNRLMMLRMHNLHGTKPPPSEGSDEEEEEEDEEDEEER 
KPQLELAMVPHYGGINRVRVSWLGEEPVAGVWSEKGQVEVFALR 
RLLQWEEPQALAAFLRDEQAQMKPIFSFAGHMGEGFALDWSPR 
VTGRLLTGDCQKNIHLWTPTDGGSWHVDQRPFVGHTRSVEDLQW 
S P TENTVFAS CSADAS I R I WD I RAA PS KACMLTTATAHDGDVNV 
ISWSRREPFLLSGGDDGALKIWDLRQFKSGSPVATFKQHVAPVT 
S VE WH PQDSG VFAAS G ADHQ I TQWDLG / 1 VERD P EAGDVEAD PG 

LADLPQQLLFVHQGETELKELHWHPQCPGLLVSTALSGFTIFRT 
ISV 


6086 


2419 


1357 


GAATQHGGAMNLLPCNPHGNGLLYAGFNQDHGCFACGMENGFRV 
YNTD V L KE KEKQE FL EGG VGHVEML FRCN YLALVGGG KK P KYP P 
NKVMIWDDLKKKTVIEIEFSTEVKAVKLRR\DKIVWLDSMIKV 
FTFTHN P \ HQLHV FE \ TC YNP KGLCVLC PNS NNS L LAFPGTHTG 
HVQLVDLASTEKPPVDIPAHEGVLSCIALNLQGTRIATASEKGT 
LIRIFDTSSGHLIQELRRGSQAANIYCINFNQDASLICVSSDHG 
TVHIFAAEDPKRNKQSSLASASFLPKYFSSKWSFSKFQVPSGSP 
CICAFGTEPNAVIAICADGSYYKFLFNPKGECIRDVYAQFLEMT 
DDKL 


6087 


476 


1877 


QNSQRTGLPITIFSRSFPLLTGSDLCENMPCTCTWRNWRQWIRP

LVAVIYLVSlVVAVPLCVWKT.nVT.FVnTWTK'&WTJT iptpt t t ikt 

P I S L W V I LQHL VH YTQ PE LQ KP I I R I LWMVP I YS LDS W I ALKY P 
GIAI YVDTCRECYEAYVI YNFMGFLTNYLTNRYPNLVLI LEAKD 
QQKHFPPLCCCPPWAMGE^LFRCKLGVLQYTVVRPFTTIVALI 
CELLG I YDEGNFS FSNAWTYLVI INNMSQL FAM YCLLLFYKVLK 
EELSPIQPVG KFLCVKLW FVS FWQ A W I ALL VKVGVI S E KHT W 
EWQTVEAVATGLQDFIICIEMFLAAIA\HHYTFSYKPYVQEAEE 
GS CFDS FLAM WDVSD I RDD I S EQ VR HVGRT VRGHPR KKL FPEDQ 
DQNEHTSLLSSSSQDAISIASSMPPSPMGHYQGFGHTVTPQTTP 
TTAKISDEILSDTIGEKKEPSDKSVDS 


6088 


1684 


689 


GASGLVRLLQQGHRCLLAPVAPKLVPPVRGVKKGFRAAFRFQKE 
LERQRLLRCPPPPVRRSEKPNWDYHAEIQAFGHRLQENFSLDLL 
KTAFVNS C YI KS EEAKRQQLG I EKEAVLLNLKSNQELSEQGTS F 
SQTCLTQFLEDEYPDMPTEGIKNLVDFLTGEEWCHVAliNLAVE 
QLTLSEEFPVPPAVLQQTFFAVIGALLQSSGPERTALFIRDFLI 
TQMTGKELFEMWKIINPMGLLVEELKKRNVSAPESRLTRQSG\A 
PTALPLYFVGLYCDKKLIAEGPGETVLVAEEEAARVALRKLYGF 
TENRRPWNYS KPKETLRAEKS I TAS 


6089 


3 


3054 


TRLGIPGSTISSRPRLCALAAEGHFliGHSWTGSRAGAHTGAPAW 
PSRRLRDLPAGGMWRLRRAAVACEVCQSLVKHSSGIKGSLPLQK 
LHLVSRSIYHSHHPTLKLQRPQLRTSFQQFSSLTNLPLRKLKFS 
PIKYGYQPRRNFWPARLATRLLKLRYLILGSAVGGGYTAKKTFD 
QWKDMIPDLSEYKWIVPDIVWEIDEYIDFEKIRKALPSSEDLVK 
LAPDFDKIVESLSLLKDFFTSGSPEETAFRATDRGSESDKHFRK 
VSDKEKIDQLQEELLHTQLKYQRILERLEKENKELRKLVLQKDD 
KGIPFIESLRKSLID M YS E VL D VLS D YDAS YNTQDH L P R WWG 
DQSAGKTSVLEMIAQARIFPRGSGEMMTRSPVKVTLSEGPHHVA 
LFKDSSREFDLTKEEDLAALRHEIELRMRKNVKEGCTVSPETIS 
LNVKGPGLQRM VLVDLPG VINTVTSGMAPDTKETI FSIS KAYMQ 
DPNAIILCIQDGSVDAERSIVTDLVSQMDPHGRRTIFVLTKVDL 
AEKNVASPSRIQQIIEGKLFPMKALGYFAWTGKGNSSESIEAI 
REYEEEFFQNSKLLKTSMLKAHQVTTRNLSLAVSDCFWKMVRES 
VEQQADSFKATRFNLETEWKNNYPRLRELDRNELFEKAKNEILD 
EVISLSQVTPKHWEEILQQSLWERVSTHVIENIYLPAAQTMNSG 
TFNTTVDIKLKQWTDKQLPNKAVEVAWETLQEEFSRFMTEPKGK 
EHDD I FDKLKEAVKEKS I KRHKWNDFAEDSLRVIQHNALEDRS I 
SDKQQWDAAIYFMEEALQARLKDTENAIENMVGPD\WKKRWLYW 
KNRTQEQCVHNETKNELEKMLKCNEEHPAYLASDEITTVRKNLE 
SRGVEVDPSLIKDTWHQVYRRHFLKTAIiNHCNLCRRGFYYYQRH 
FVDSELECNDWLFWRIQRMLAITANTLRQQLTNTEVRRLEKNV 

KEVLEDFAEDGEKKIKLLTGKRVQLAEDLKKVREIQEKLDAFIE 
ALHQEK 


6090 


194 


1560 


PVFVPAPGAVLEQAS/ASPPLATQTWPLQHCKIPELPVQASIL 
FELQLFFCQLIAXjFVHYINIYKTVWWYPPSHPPSHTSLNFHLID 
FNItLMVTT I VLGRRFIGS I VKEASQRGKVSLFRS ILLFLTRFTV 
LTATGWSLCRSLIHLFRTYSFLNLL/FPLLSVWDVHSVPAAELR 
P \ RKTS L FNHMAS MGPRE AVS GLAKS RD YLLTLR \ RRGS STQDS 
CMART P C P / PHAC CLS PS L I RS EVE FLKMDFNWRMKE VLVS SML 
SAYYVAFVPVWFVKNTHYYDKRWSCELFLLVSISTSVILMQHLL 
PAS Y CD LLH KAAAHLG CWQ KVD PAL C SNVLQH P WT E E CM W PQG V 
LVKHSKNVYKAVGHYNVAIPSDVSHFRFHFFFSKPLRILNILLL 
LEGAVIVYQLYSLMSSEKWHQTISLALILFSNYYAFFKLLRDRL 
VLGKAYS YSAS PQRDLDHR FS 


6091 


3279 


412 


SSRTREMEEKEILRRQIRLLQGLIDDYKTLHGNAPAPGTPAASG 
WQPPTYHSGRAFSARYPRPSRRGYSSHHGPSWRKKYSLVNRPPG 
PSDPPADHAVRPLHGARGGQPPVPQQHVLERQVQLSQGQNWIK 
VKPPSKSGSASASGAQRGSLEEFEDTPWSDQRPREGEGEPPRGQ 
LQPSRPTRARGTCSVSDPLLVCQKEPGKPRMVKSVGSVGDSPRE 
PRRTVSESVIAVKASFPSSALPPRTGVALGRKLGSHSVASCAPQ 
LLGDRRVDAGHTDQP VPSGS VGG PARPASGPRQAREAS LWTCR 
TNKFRKNN YICWVAAS S KS PRVARRALSPRVAAENVCKASAGMAN 
KVEKPQLIADPEPKPRKPATSSKPGSAPSKYKWKASSPSASSSS 
SFRWQSEAGSKDHASQLSPVLSRSPSGD\RPAVGHSGLKPLSGE 
TPLSAYKVKSRTKIIRRRGSTSLPGDKKSGTSPAATAKSHLSLR 
RRQALRGKSSPVLKKTPNKGLVQVTTHRLCRLPPSRAHLPTKEA 
SSLHAVRTAPTSKVIKTRYRIVKKTPASPLSAPPFPLSLPSWRA 
RRLiSLSRSLVLNRLRPVASGGGKAQPGSPWWRSKGYRCIGGVLY 
KVSANKLSKTSGOPSDAGSRPLLRTGRLDPAGSCSRqT.AQpavn 
RSLAIIRQARQRREKRKEYCMYYNRFGRCNRGERCPYIHDPEKV 
AVCTRFVRGTCKKTDGTCPFSHHVSKEKMPVCS YFLKG I CSNSN 
CPYSHVYVSRKAEVCSDFLKGYCPLGAKCKKKHTLLCPDFARRG 
AC PRGAQ CQL LHRTQ KRHS RRAATS PAPG P S DATAR S R VS AS HG 
PRKPSASQRPTRQTPSSAALTAAAVAAPPHCPGGSASPSSSKAS 
SSSSSSSSPPASLDHEAPSLQEAALAAACSNRLCKLPSFISLQS 
S P S PGAQ PRVRAPRAP LTKDSGKPLH IKPRL 


6092 


143 


3190 


AKAPPTGESSEPEAKVLHTKRLYRAWEAVHRLDLILCNKTAYQ 
E VFKPEN I S LRNKLRELCVKLM FLHP VD YGRKAEELL WRKVY YE 
VIQLIKTNKKHIHSRSTLECAYRTHLVAGIGFYQHLLLYIQSHY
QLELQCCIDWTHVTDPLIGCKKPVSASGKEMDWAQMACHRCLVY
LGDLSRYQNEIAGVDTELLAERFYYQALSVAPQIGMPFNQFCGTL
AGSKYYNVEAMYCYLRCIQSEVSFEGAYGNLKRLYDKAAKMYHQ
LKKCETRKLSPGKKRCKDIKRLLVNFMYLQSLLQPKSSSVDSEL
TSLCQSVLEDFNLCLFYLPSSPNLSLASEDEEEYESGYAFLPDL
LIFQMVIICLMCVHSLIERAGSKQYSAAIAFTLAIIFSHLVNHVNI
RLQAELEEGENPVPAFQSDGTDEPESKEPVEKEEEPDPEPPPVT
PQVGEGRKSRKFSRLSCLRRRRHPPKVGDDSDLSEGFESDSSHD
SARASEGSDSGSDKSLEGGGTAFDAETDSEMNSQESRSDLEDME
EEEGTRSPTLEPPRGRSEAPDSLNGPLGPSEASIASNLQAMSTQ
MFQTKRCFRLAPTFSNLLLQPTTNPHTSASHRPCVNGDVDKPSE
PASEEGSESEGSESSGRSCRNERSIQEKLQVLMAEGLLPAVKVF
LDWLRTNPDLIIVCAQSSQSLWNRLSVLLNLLPAAGELQESGLA
LCPEVQDLLEGCELPDLPSSLLLPEDMALRNLPPLRAAHRRFNF
DTDRPLLSTLEESWRICCIRSFGHFIARLQGSILQFNPEVGIF
VSIAQSEQESLLQQAQAQFRMAQEEARRNRLMRDMAQLRLQLEV
SQLEGSLQQPKAQSAMSPYLVPDTQALCHHLPVIRQLATSGRFI
VIIPRTVIDGLDLLKKEHPGARDGIRYLEAEFKKGNRYIRCQKE
VGKSFERHKLKRQDADAWTLYKILDSCKQLT\LAQGAGEEDPSG
MVTIITGLPLDNPSLLSGPMQAALQAAAHASVDIKNVLDFYKQW

KEIG 


6093 


76 


1002 


ACGRRANLALRVART/SRWGAL\RGAVWAPGTRPSKRRACWALL
PPVPCCLGCLAERWRLRPAALGLRLPGIGQRNHCSGAGKAAPR\
PAAGAGAAAEAPGGQWGPASTPSLYENPWTIPNMLSMTRIGLAP
VLGYLIIEEDFNIALGVFALAGLTDLLDGFIARNWANQRSALGS
ALDPLADKILISILYVSLTYADLIPVPLTYMIISRDVMLIAAVF
YVRYRTLPTPRTLAKYFNPCYATARLKPTFISKVNTAVQLILVA

ASLAAPVFNYADSIYLQILWCFTAFTTAASAYSYYHYGRKTVQV 
IKD 


6094 


23 


1010 


P FLR CLRGDQKAKMS ER K VLN K Y Y P PDFD P S K I P KLKL PKDRQ Y 
WRLMAPFNMRCKTCGEYIYKGKKFNARKETVQNEVYLGLPIFR 
FYIKCTRCLAEITFKTDPENTDYTMEHGATRNFQAEKLLEEEEK 
RVQKEREDEELNNPMKVLENRTKDSKLEMEVLENLQELKDLNQR 
QAHVDFEAMLRQHRLSEEERRRQQQEEDEQETAALLEEARKRRL 
LEDSDSEDEAAPSPLQPALRPNPTAILDEAPKPKRKVEVWEQSV 
GSLGSRPPLSRLVWKKAKADPDCSNGQPQA/APHPRSPAEQEG 
GQPYTPDAWRVLPEPTGCIPGQ 


6095 


1 


1599 


TRGRAAERSRGRGHGFLGGGFA\SWDYFPSEDFYRCGYCKNES 
GSRSNGMWAHSMTVQDYQDLIDRGWRRSGKYVYKPVMNQTCCPQ 
YTIRCRPLQFQPSKSHKKVLKKMLKFLAKGEVPKGSCE\DEPMD 
STMDDAVAGDFALINKLDIQCDLKTLSDDIKESLESEGKNSKKE 
EPQELLQSQDFVGEKLGSGEPSHS 
VKVHTVPKPGKGADLSKPPCRKAKEIRKERKRLKLMQQNPAGEL 
EGFQAQGHPPSLFPPKAKSNQPKSLEDLIFESLPENASHKLEVR 
WRSSPPSSQFKATLLESYQVYKRYQMVIHKNPPDTPTESQFTR 
FLCSSPLEAETPPNGPDCGYGSFHQQYWLDGKIIAVGVIDILPN 
CVSSVYLYYDPDYSFLSLGVYSALREIAFTRQLHEKTSQLSYYY 
MGFYIHSCPKMKYKGQYRPSDLLCPETYVWVPIEQCLPSLENSK 
YCRFNQDPEAVDEDRSTEPDRLQVFHKRAIMPYGVYKKQQKDPS 
E E AAVLQ Y AS L VGQ KCS E RML LFRN 


6096 


2277 


575 


QRVRAALLSSAMEDSEALGFEHMGLDPRLLQAVTDLGWSRPTLI 
QEKAI PLALEGKDLLARARTGSGKTAAYAT PMIjOLliT.HRK-aTr d 
WEQ AVRG LVLV P TKE LARQAQSM I QQLATY CARD VR VANVS AA 
EDSVSQRAVLMEKPDVWGTPSRILSHLQQDSLKLRDSLELLW 
DEADLLFSFGFEEELKSLLCHLPRIYQAFLMSATFNEDVQALKE 
LILHNPVTLKLQESQLPGPDQLQQFQWCETEEDKFLLLYALLK 
LSLIRGKSLLFVNTLERSYRLRLFLEQFSIPTCVLNGBLPLRSR 
CHIISQFNQGFYDCVIATDAEVLGAPVKGKRRGRGPKGDKASDP 
EAGVARGIDFHHVSAVLNFDLPPTPEAYIHRAGRTARANNPGIV 
LTFVLPTEQFHLGKIEELLSGENRGPILLPYQFRMEEIEGFRYR 
CRDAMRS VTKQA IREARLKE I KEELLHSEKLKTYFEDNPR \ DLQ 
LLRHDLPLHPAWKPHLGHVPDYLVPPALRGLVRPHKK\GRSCL 
PLVGRPREQSPRTHCAASSTKERNSDPQPSPPEWGPLWS 


6097 


1673 


192 


APGTMSGGKKKS S FQXT < ?VTT > nVB , n Prig pnacnD dtdhd ptv^pp — 
PRL PNGE PS PD P GG KGTPRNGS PP PG AP S SR FR WKLPHGLG E P 
YRRGRW TCVDVYERDLEPHS FGGLLEG I RGASGGAGGRSLDSRL 
ELASLGLGAPTPPSGLSQGPTSWLRPPPTSPGPQARSFTGGLGQ 
LWPSKAKAEKPPLSASSPQQRPPEPETGESAGTSRAATPLPSIi 
R VEAE AGG SG ART P P LSRR KAVDMRLRME LGAPE EMGQ VP P LDS 
R PS S PAL Y FTHDAS LVHKS PD P FGAVAAQ KFS LAH S MLA I S GHL 
DSDDDSGSGSLVGIDNKIEOAMriT,Vl('C'UT.Mi?n\7DTrirT7T?\7T xr-cr\T 

RELAERNAALEQENGLLRALA\ SPEQLGS AGPPRGVPR\ LGPPA 
PNGPFVLSLPSLTIVPLGLPGLASAAWPPLPMPALIVPVFPGVG 
VQALSNGPWSPGPLPHLLIIPSLDGGGEGFRTGRQQGAPFGEET 
QPPPSLPGTPQQ 


6098


168 


1074 


NYCLRHRSPLEKDSSPGSSSTSLLIKKQRETSDTPIMRALKELD
EGKIFKNWGTQTEKEDTSNINPRQTETSVNASRSPEKCAQQRQK
RLNSASQRSSSLPPSNRKSSTPTKREIMLTPVTVAYSPKRSPKE
NLSPGFSHLLSKNESSPIRFDILLDDLDTVPVSTLQRTNPRKQL
\QFLPLDDSEEK\TYSEKAT\DNIVNHSSCPEPVPNGVKKVSVR
TAWEKNKSVSYEQCKPVSVTPQGNDFEYTAKIRTLAETERFF\D
ELTKEKDQIEAALSRMPSPGGRITLQTRLNQEAFGRSFGKD


6099 


168 


1074 


NYCLRHRSPLEKDSSPGSSSTSLLIKKQRETSDTPIMRALKELD
EGKIFKNWGTQTEKEDTSNINPRQTETSVNASRSPEKCAQQRQK
RLNSASQRSSSLPPSNRKSSTPTKREIMLTPVTVAYSPKRSPKE
NLSPGFSHLLSKNESSPIRFDILLDDLDTVPVSTLQRTNPRKQL
\QFLPLDDSEEK\TYSEKAT\DNIVNHSSCPEPVPNGVKKVSVR
TAWEKNKSVSYEQCKPVSVTPQGNDFEYTAKIRTLAETERFF\D
ELTKEKDQIEAALSRMPSPGGRITLQTRLNQEAFGRSFGKD


6100 


2 


713 


FVE VSG YRSRADPEPRGRDTMTYAYLFKYI I IGDTGVGKSCLLL 
Q FTD KR FQ P VHDLT I G VE FGARM VN I DGKQ I KLQI WDTAGQE S F 
RS I TRS Y YRGAAGALL VYDITR RETFNHLTS WLEDARQHS S SNM 
VIMLIGNKSDLESRRDVKREEGEAFARE\HGLIFMETSAKTACN 
VEEAFINTAKEIYRKIQQGLFDVHNEANGIKIGPQQSISTSVGP 
SASQRNSRD IGSNSGCC 


6101 


1


1399 


FRGRAWPLREVSHWLGCRRVCSWSASWGPvLPALSARLSPLLAFR 
GKMVFPLSCAVQQYAWGKMGSNSEVARLLASSDPLAQIAEDKPY 
AE LWMGTHP RGDAK I LDNR I SQ KTLS Q W I AENQD S LGS KVKDT F 
NGNLPFLFKVLSVETPLSIQAHPNKELAEKLHLQAPQHYPDANH 
KPEMAIALTPFQGLCGFRPVEEIVTFLKKVPEFQFLIGDEAATH
LKQTMSHDSQAVASSLQSCFSHLMKSEKKVWEQLNLLVKRISQ 
QAAAGNNMEDIFGELLLQLHQQYPGDIGCFAIYFLNLLTLKPGE 
AMFLEANVPHAYLKGDCVECMACSDNTVRAGLTPKFIDVPTLCE 
MLSYTPSSSKDRLFLPTRSQEDPYLSIYDPPVPDFTIMKA\BVP 
G\SVTEYKDLALDSASILLMVQGTVIASTPTTQTPIPLQRGGVL 
FIGANESVSLKLTEPKDLLIFRACCLL 


6102 


70 


2415 


QTPQATLAANGAEDSRGGEMLPAGEIGASPAAPCCSESGDERKN
LEEKSDINVTVLIGSKQVSEGTDNGDLPSYVSAFIEKEVGNDLK
SLKKLDKLIEQRTVSKMQLEEQVLTISSEIPKRIRSALKNAEES
KQFLNQFLEQETHLFSAINSHLLTAQPWMDDLGTMISQIEEIER
HLAYLKWISQIEELSDNIQQYLMTNNVPEAASTLVSMAELDIKL
QESSCTHLLGFMRATVKFWHKILKDKLTSDFEEILAQLHWPFIA
PPQSQTVGLSRPASAPFT Y^YTiFTT.FpnT.T.KT.nTGUJ?! t nntPDV\ 

HSQKNTLFLPPLLSS/WPIQVMLTPLQKRFRYHFRGNRQTNVLS 
KPEWYLAQVLMWIGNHTEFLDEKIQPILDKVGSLVNARLEFSRG 

LMMLVLEKLATDIPCLLYDDNLFCHLVDEVLLFERELHSVHGYP
GTFASCMHILSEETCFQRWLTVERKFALQKMDSMLSSEAAWVSQ
YKDITDVDEMKVPDCAETFMTLLLVITDRYKNLPTASRKLQFLE
LQKDLVDDFRIRLTQVMKEETRASLGFRYCAILNAVNYISTVLA
DWADNVFFLQLQQAALEVFAENNTLSKLQLGQLASMESSVFDDM
INLLERLKHDMLTRQVDHVFREVKJDAAKLYKKERWLSLPSQSEQ
AVMSLSSSACPLLLTLRDHLLQLEQQLCFSLFKIFWQMLVEKLD
VYIYQEIILANHFNEGGAAQLQFDMTRNLFPLFSHYCKRPENYF
KHIKEACIVLNLNVGSALTAGKDVLPVQLQGSFPAT


6103 


207 


2523 


ESNSTMTTYLEFIQQNEERDGVRFSWNVWPSSRLEATRMWPVA 
ALFTPLKERPDLPPIQYEPVLCSRTTCRAVLNPLCQVDYRAKLW 
ACNFCYQRNQFPPSYAGISELNQPAELLPQFSSIEYWLRGPQM
PLIFLYWDTCMEDEDLQALKESMQMSLSLLPPTALVGLITFGR 
MVQVHELGCEGISKSYVFRGTKDLSAKQLQBMLGLSKVPVTQAT 
RGPQVQQPPPSNRFLQPVQKIDMNLTDLLGELQRDPWPVPQGKR 
PLRSSGVALS IAVGLLECTFPNTfiAR TMMPTnfiDaTnrDrM\n;ri 
DELKTPIRSWHDIDKDNAKYVKKGTKHFEALANRAATTGHVIDI
YACALDQTGLLEMKCCPNLTGGYMVMGDSFNTSLFKQTFQRVFT
KDMHGQFKMGFGGTLEIKTPR\EIKISGAIGPCVSLNSKGPCVS
ENEIGTGGTCQWKICGLSPTTTLAIYFEWNQHNAPIPQGG\RG
A\IQFVTQY\QHSSGQRRIRVTTIARN\WADAQTQIQNIAASFD
QEAAAILMARLAIYRAETEEGPDVLRWLDRQLIRLCQKFGEYHK
DDPSSFRFSETFSLYPQFMFHLRRSSFLQVFNNSPDESSYYRHH 
FMRQDLTQSLIMIQPILYAYSFSGPPEPVLLDSSSILADRILLM 
DTFFQILIYHGETIAQWRKSGYQDMPEYENFRHLLQAPVDDAQE 
ILHSRFPMPRYIDTEHGGSQARFLLSKVNPSQTHNNMYAWGQES 
GAPILTDDVSLQVFMDHLKKLAVSSAA


6104 


124 


732 


KVSEYIILSKDKILFHALAMLVLWSPWSAARGVLRNYWERLLR 
KLPQSRPGFPSPPWGPALAVQ\AQPCLQSQQMIPVEVKRI/RSL 
LDSIFWMAAPKNRRTIEVNRCRRRNPQKLIKVKNNIDVCPECGH 
LKQKHVLCAYCYEKVCKETAEIRRQIGKQEGGPFKAPTIETWL
YTGETPSEQDQGKRIIERDRKRPSWFTQN


6105 


3 


989 


PLHGACTSLVLQRFCHRRPRPCAPARPEDMRRPAAVPLLLLLCF
GSQRAKAATACGRPRMLNRMVGGQDTQEGEWPWQVSIQRNGSHF
CGGSLIAEQWVLTAAHCFRNTSETSLYQVLLGARQLVQPGPHAM
YARVRQVESNPLYQGTASSADVALVELEAPVPFTNYILPVCLPD 
PSVIFETGMNCWVTGWGSPSEEDLLPEPRILQKLAVPIIDT\PR 
CNLLYSKDTEFGYQPKTIKNDMLCAGFEEGKKDACKGDSAGPLV 
CLVGQSWLQAGVISWGEGCARQNRPGVYIRVTAHHNWIHRIIPK
LQVQPSEVGRPEVTPPGPGAP 


6106 


3 


1302 


GRPPTAPHTGRPPTANRGDPRLDLKRGCARLLTSIESRGRPAAS 
AGLRRDRCALRRWPLRRAPLARATRRRAGSPRRCAPRPRACPQG 
WSRARHQPGGLCLLLLLLCQFMEDRSAQAGNCWLRQAXNGRCQV 
LYKTELSKEECCSTGRLSTSWTEEDVNDNTLFKWMIFNGGAPNC 
IPCKETCENVDCGPGKKCRMNKKNKPRCVCAPDCSNITWKGPVC 
GLDGKTYRNECALLKARCKEQPELEVQYQGRCKKTCRDVFCPGS 
STCV\VDQTNNAYCVTCNRICPEPASSEQYLCGNDGVTYS\SAC 
HLRKATCLLGRSIGLAYEGKCIKAKSCEDIQCTGGKKCLWDFKV 
GRGRCSLCDELCPDSKSDEPVCASDNATYASECAMKEAACSSGV 
LLEVKHSGSCNSISEDTEEEEEDEDQDYSFPISSILEW 


6107 


623 


168 


SRCSSPRPEPGRGRGK/LSPSEHRKWVEVFKACDEDHKGYLSRE 
DFKTAWMLFGYKPSKIEVDSVMSSINPNTSGILLEGFLNIVRK
KKEAQRYRNEVRHIFTAFDTYYRGFLTLEDFKKAFRQVAPKLPE
RTVLEVFREV\DRDS\DGHVSF


6108 


3 


1348 


GGSLRFSPPRVPSCSRVFCPVPPGGCGLPSPMSASRPQSPTTPW
CLPRRYMKHKRDDGPEKQEDEAVDVTPVMTCVFWMCCSMLVLL 
YYFYDLLVYWIGIFCLASATGLYSCLAPCVRRLP\SASAGESA
LLAPTIPNNSLPYFHKRPQARMLLLALFCVAVSWWGVFRNSDQ
WAWVLQDALGIAFCLYMLKTIRLPTFKACTLLLLVLFLYDIFFV 
FITPFLTKSGSSIMVEVATGPSDSATREKLPMVLKVPRLNSSPL
ALCDRPFSLLGFGDILVPGLLVAYCHRFDIQVQSSRVYFVACTI 
AYGVGLLVTFVALALMQRGQPALLYLVPCTLVTSCAVALWRREL 
GVFWTGSGFAKVLPPSPWAPAPADGPQPPKDSATPLSPQPPSEE 
PATSPWPAEQSPKSRTSEEMGAGAPMREPGSPAESEGRDQAQPS
PVTQPGASA 


6109 


1 


1381 


CRSRAGAASGGAILEGTKLRRQRVDTNKPLDPLVPSALRAAMLY
LEDYLEMIEQLPMDLRDRFTEMREMDLQVQNAMDQLEQRVSEFF
MNAKKNKPEWREEQMASIKKDYYKALEDADEKVQLANQIYDLVD
RHLRKLDQELAKFKMELEADNAGITEILERRSLELDTPSQPVNN
HHAHSHTPVEKRKYNPTSHHTTTDHIPEKKFKSEALLSTLTSDA
SKENTLGCRNNNSTASSNNAYNVNSSQPLGSYNIGSLSSGTGAG
GI\TMAAAQAVQATAQMKEGRRTSSLKASYEAFKNNDFQLGKEF
SMARETVGYSSSSALMTTLTQNASSSAADSRSGRKSKNNNKSSS
QQSSSSSSSSSLSSGSSSSTWQEISQQTTWPESDSNSQVDWT
YDPNEPRYCICNQVSYGEMVGCDTQDCPIEWFHYGCVGLTEAPK
GKWYCPQCT\AAMKRRGSRHK


6110 


77 


2464 


ACPSAATMSDQDHSMDEMTAWKIEKGVGGNNGGNGNGGGAFSQ 
ARSSSTGSSSSTGGGGQESQPSPLALLAATCSRIESPNENSNNS 
QGPSQSGGTGELDLTATQLSQGANGWQIISSSSGATPTSKEQSG 
SSTNGSNGSESSKNRTVSGGQYWAAAPNLQNQQVLTGLPGVMP 
NIQYQVIPQFQTVDGQQLQFAATGAQVQQDGSGQIQIIPGANQQ
IITNRGSGGNIIAAMPNLLQQAVPLQGLANNVLSGQTQYVTNVP
VALNGNITLLPVNSVSAATLTPSSQAVTISSSGSQESGSQPVTS
GTTISSASLVSSQASSSSFFTNANSYSTTTTTSNMGIMNFTTSG
SSGTNSQGQTPQRVSGLQGSDALNIQQNQTSGGSLQAGQQKEGE 
Q\NQQTQAAPKSLSRPQLVQGG\QALQ\AFQAAPLSGQTFTTQA 
ISQETLQNLQLQAVPNSGPIIIRTPTVGPNGQVSWQTLQLQNLQ
VQNPQAQTITLAPMQGVSLGQTSSSNTTLTPIASAASIPAGTVT 
VNAAQLSSMPGLQTINLSALGTSGIQVHPIQGLPLAIANAPGDH 
GAQLGLHGAGGDGIHDDTAGGEEGENSPDAQPQAGRRTRREACT
CPYCKDSEGRGSGDPGKKKQHICHIQGCGKVYGKTSHLRAHLRW 
HTGERPFMCTWSYCGKRFTRSDELQRHKRTHTGEKKFACPECPK 
RFMRSDHLSKHIKTHQNKKGGPGVALSVGTLPLDSGAGSEGSGT
ATPSALITTNMVAMEAICPEGIARLANSGINVKEGGQFCSPINT 
SANGF 


6111 


1*37 


797 


RVDPRVRGAMAPWGKRIAGVRGVLLDISGVLYDSGAGGGTAIAG 
SVEAVARLKRSRLKVRFCTNESQKSRAELVGQLQRLGFDISEQE 
VTAPAPAACQILKERGLRPYLLIHDGV\ASEFDQIDTS/STPNC 
WiADAGESFSYQNMNNAFQVLMELEKPVLISLGKGRYYKETSG 
LMLDVGPYMKALEYACGIKAEVGGKPSPEFFKSALQAIGVEAHQ
AVMIGDDIVGDVGGAQRCGMRALQVRTGKFRPSDEHHPEVKADG 
YVDNLAEAVDLLLQHADK 


6112 


77 


196 


MSSHKSFKSKRFLAKKQKPNRPILQWIWLKTGNKIRHNWK


6113 


1779 


567 


WEGRSWAACGVNLQGAWGERSGVRASEAESPGKRADVSWWSRQL
ETMVDHLANTEINSQRIAAVESCFGASGQPLALPGRVLLGEGVL
TKECRKKAKPRIFFLFNDILVYGSIVLNKRKYRSQHIIPLEEVT 
LELLPETLQAKNRWMIKTAKKSFWSAASATERQEWISHIEECV 
RRQLRATGRPA\STEHAAPWIPDKATDICMRCTQTRFSALTRRH
HCRKCRVWCAECSRQRFLLPRLSPKPVRVCSLCYRELAAQQRK 
EEAEEQGAGVPRAASHLARPICGRPVEMTMTPTRTRRAAGTATG 
PAAWSSTPRGWPGLPSTADPRPAEHLSPSQLHCPGPQEGSSRSC 
PGLRDPIPWWQVQRWGVALSGLPVPFCWTLCPYGFTAGNAFPFR 
KPQNTHRSW 


6114 


818 


24£ 


PTSRPRPSPGSPAMSWSACVSAAPSSSWPASSSWPCGPRRCCTR 
RRRCSPRCGLAAGSMCSCSPSWRCTPVPACWPSPPP\PAEQVQC 
GHLPPHADRRALRLPVAAPARGPGPGHPAGPAGPRPARTPPASP
HGPGRPTVPAPPCPLLAATEPTPSRPHQRWTREDRMLGRGSQVT
GRPQWFLRGLVLFSL


6115 


324 


71 


DVCGR VCAHPHL YTH IHMHI CAHAC \ I HTHAQLC7 1 TASHALAH 
SHLYTCMVMLTASHTPSHTHPHTAVHKEHRADVLRGTLTPLR 


6116 


595 


1430 


TGVMPPGRWHAA/ISSSGPVFEGARA\LQTVKKEEEDESYTPVQ 
AARPQTLNRPGQELFRQLFRQLRYHESSGPLETLSRLRELCRWW 
LRPDVLSKAQILELLVLEQFLSILPGELRVWVQLHNPESGEE\L 
WPCWRSCRGTLMGHPGGTRALP\EPRCALDGYRS\LRSAQIWSL
ASPLRSSSALGDHLEPPYEIEARDFLAGQSDTPAAQMPALFPRE
GCPGDQVTPTRSLTAQLQETMTFKDVBVTFSQDEWGWLDSAQRN 
LYRDVMLENYRNMASLGK 


6117 


1433 


222 


VGVPSPAPPCSWEVGPGGGWTPGILKEGQGGRRTPLLLLATRTR
GLLSLFPPAAMHPAAFPLPVWAAVLWGAAPTRGLIRATSDHNA 
SMDFADLPALFGATLSQEGLQGFLVEAHPDNACSPIAPPPPAPV
NGSVFIALLRRFDCTNFDLKVLNAQKAGYGAAVVHNVNSNELLNM 
VWNSEEIQQQIWIPSVFIGERSSEYLRALFVYEKGARVLLVPDN 
TFPLGYYLIPFTGIVGLLVLAMGAVMIARCIQHRKRLQRNRLTK
\EQLKQI\PTHDYQKGDQYDVCAICLDEYEDGDKLRVLPCAHAY
HSRCVDPWLTQTRKTCPICKQPVHRGPGDEDQEEETQGQ3EGDB 
GEPRDHPASERTPLLGSSPTLPTSFGSLAPAPLVFPGPSTDPPL 
SPPSSPVILV 


6118 


1044 


247 


STISCRACTSGATPGAQSHRSARGHAAGGKETAALGMERGKVKK 
^l^aJUiiyKEKIGEKGREEKVKRKEVEQKIKQEKQEKQERRKGK 
EKEEKRTKQGKETNKEKEQFKGQEEKGENKDSTLTRTPLEPLEK 
NKQILVLGLDGAGKTSVLHSLASNRVQHSVAPTQGFHAVCINTE 
DSQMEFLEIGGSKPFRSYWEMYLSN/ADSLARSFSVGFKQDSQP 

ITWKAKKYLHQLIAANPVLPLWFANKQDLEAAYHITDIHEALA 
II 


6119 


1217 


462 


DPRFVTENTTKAPAQERTTQPRSSREGTLRSTMEYLSALNPSDL

LRSVSNISSEFGRRVWTSAPPPQRPFRVCDHKRTIRKGLTAATR 

QELLAKALETLLLNGVLTLVLEEDGTAVDSEDFFQLLEDDTCLM

VLQSGQSWSPTRSGVLSYGLGRERPKHSKDIARFTFDVYKQNPR 

DLFGSLNVKATFYGLYSMSCDFQGI>\GPKKVLRELLRWTSTLLQ 

GLGHMLLGISSTLRHAVEGAEQWQQKGRLHSY 


6120 


785 


179 


LERAGGGGLSSRALVGSGACLSLVARANGKGLPRGRKEFVEAVR 
VRWAFRYRTPRAVCLRLWSCRREVIMSGRGKQGGKVRAKAKSR 
SSRAGLQFPVGRVHRLLRKGNYAERVGAGAPVYLAAVLEYLTAE
ILELAGNAARDNKKTRIIPRHLQLAIRNDEELNKLLGKVTIAQG 
G\VLPNIQAVLLPKKTESQKDEGANDP


6121 


1612 


107 


F VRAOARGS ROPVRR P LLGAGS R LR C R S PHP MP V T . w w ww a t r m 
RGNGLRAVTPLRPGELLFRSDPLAYTVCKGSRGWCDRCLLGKE 
KLMRCSQCRVAKYCSAKCQKKAWPDHKRECKCLKSCKPRYPPDS 
VRLLGRWFKLMDGAPSESEKLYSFYDLESNINKLTEDKKEGLR 
QLVMTFQHFMREEIQDASQLPPAFDLFEAFAKVICNSFTICNAE 
MQEVGVGLYPSISLLNHSCDPNCSIVFNGPHLLLRAVRDIEVGE 
ELTICYLDMLMTSEERRKQLRDQYCFECD\CFRCQTQDKDADML 
TGDEQVWKEVQESLKKIEELKAHWKWEQVLAMCQAIISSNSERL
PDINIYQLKVLDCAMDACINLGLLEEALFYGTRTMEPYRIFFPG 
SHPVRGVQVMKVGKLQLHQGMFPQAMKNLRLAFDIMRVTHGREH
SLIEDLILLLE/AMRRQHQSILRERSQREIRRVSLLNALLRSHT
LCFVSCVNLSYWKFCSVFV 


6122 


2 


2324 


RFRKMADGGAASQDESSAAAAAAADSRMNNPSETSKPSMESGDG

NTGTQTNGLDFQKQPVPVGGAISTAQAQAFLGHLHQVQLAGTSL 

QAAAQSLNVQSKSNEESGDSQQPSQPSQQPSVQAAIPQTQLMLA 

GGQITGLTLTPAQQQLLLQQAQAQAQLLAAAVQQHSASQQHSAA 

GATISASAATPMTQIPLSQPIQIAQDLQQLQQLQQQNLNLQQFV

LVHPTTNLQPA\QFIISQTPQGQQGLLQA\QNLLTQLPRQSQAN 

TjTiOSOPT? T\ TT 1 TGr\T3B , T , D r Pr"T 1 TB BTDTrtT»T riAPArrnrM/n TnrT , nn 
xjjj^ovr-rtx \ lux jyrH j. rlLl ±MJ\l FXy 1 JjJtQSQSTPKRIDTPS 

LEEP\SDLEELEQ FAKT F KQ RR I KLG FT \ QG DAGLAMV Kb YGND 

FSPTTIFRFEALNLSFKNMCKLKPLLEKWLNDAENLSSDSSLSS 

PSALNSPGIEGLSRRRKKRTSIEA\NIRVALEKSFLEN\QKPTS 

EEITMIADQLNMEKGVIRVWFCNRRQKEKRINPPSSGG\TSSSP 

IKAIFPSPTSLVATTPSLVTSSAATTLTVSPVLPLTSAAVTNLS

VTGTSDTTSNNTATVISTAPPASSAVTSPSLSPSPSASASTSEA 

SSASETSTTQTTSTPLSSPLGTSQVMVTASGLQTA/AQLLPFKG 

AAQLPANASLAAMAAAAGLNPSLMAPSQFAAGGALLSLNPGTLS

GALSPALMSNSTLATIQALASGGSLPITSLDATGNLVFANAGGA

PNIVTAPLFLNPQNLSLLTSNPVSLVSAAAASAGNSAPVASLHA 

TSTSAESIQNSLFTVASASGAASTTTTASKAQ


6123


3 


2944 


HLLHRWFGTDMQMINFTTGEFQLTEACPYLGTHSEESRFGILHL
HLQPLEMKRVGWFTPADYGKVTSLILIRNNLTVIDMIGVEGFG 
ARELLKVGGRLPGAGGSLRFKVPESTLMDCRRQLKDSKQILSIT 
KNFKVENIGPLPITVSSLKINGYNCQGYGFEV^DCHQFSLDPNT 
SRDISIVFTPDFTSSWVIRDLSLVTAADLEFRFTLNVTLPHHLL 
PLCADWPGPSWEESFWRLTVFFVSLSLLGVILIAFQQAQYILM 
EFMKTRQRQNASSSSQQNNGPMDVISPHSYKSNCKNFLDTYGPS 
DKGRGKNCLPVNTPQSRIQNAAKRSPATYGHSQKKHKCSVYYSK 
HKTSTAAASSTSTTTEEKQTSPLGSSLPAAKEDICTDAMRENWI 
SLRYASGINVNLQKNLTLPKNLLNKEENTLKNTIVFSNPSSECS 
MKEGIQTCMFPKETDIKTSENTAEFKERELCPLKTSKKLPENHL
PRNSPQYHQPDLPEISRKNNGNNQQVPVKNEVDHCENLKKVDTK 
PSSEKKIHKTSREDMFSEKQDIPFVEQEDPYRKKKLQEKREGNL 
QNLNWSKSRTCRKNKKRGVAPVSRPPEQSDLKLVCSDFERSELS 
SDINVRSWCIQESTREVCKADAEIASSLPAAQREAEGYYQKPEK 
KCVDKFCSDSSSDCGSSSGSVRASRGSWGSWSSTSSSDGDKKPM 
VDAQHFLPAGDSVSQNDFPSEAPISLNLSHWICNPMTGNSLPQY
AEPSCPSLPAGPTGVEEDKGLYSPGDLWPTPPVCVTSSLNCTLE 
NGVPCVIQESAPVHNSFIDWSATCEGQFSSAYCPLELNDYNAFP
EENMNYANGFPCPADVQTDFIDHNSQSTWNTPP\NMPAS\WGNA
QFPSSSRPYLKSTPKACLPMSGLFGPI\WAP\QSDVYENCCPIN 
PTTEHSD/THMENQA\WCKEYYPGF\NPFRAYMNLDIWTTT\A
NRNANFPLSRDSSYCGNV 


6124 


1573 


236 


SDEALRLAGERGMGRVQLFEISLSHGRWYSPGEPLAGTVRVRL
GAPLPFRAIRVTCIGSCGVSNKANDTAWWEEGYFNSSLSLADK 
GSLPAGEHSFPFQFLLPATAPTSFEGPFGKIVHQVRAAIHTPRF 
SKDHKCSLVFYILSPLNLNSIPDIEQPNVASATKKFSYKLVKTG
SWLTASTDLRGYWGQALQLHADVENQSGKDTSPWASIiLQKV 
SYKAKRWIHDVRTIAEVEGAGVKAWRRAQWHEQILVPALPQSAL 
PGCSLIHIDYYLQVSLKAPEATVTLPVFIGNIAV/NPCPSEPPA
RPGAASWGPTPGG\PSAPPQEEAEABAAAGGPHFLDPVFLSTKS
HSQRQPLLATLSSVPGAPEPCPQDGSPASHPLHPPLCISTGATV 
PYFAEGSGGPVPTTSTLILPPEYSSWGYPYEAPPSYEQSCGGVE 
PSLTPES 


6125


1 


904 


KTCPKLTCAFTVSVPDSCCRVCRGDGELSWEHSDGDIFRQPANR 
EARHSYHRSHYDPPPSRQAGGLSRFPGARSHRGALMDSQQASGT 
IVQIVIIWKHKHGQVCVSNGKTYSHGESWHPNLRAFGIVECVLC 
TCNVTKQECKKIHCPNRYPCKYPQKIDGKCCKVCPG/KKAKEEL
PGQSFDNKGYFCGEETMPVYESVFMEDGETTRKIALETERPPQV 
EVHVWTIRKGILQHFHIEKISKRMFEELPHFKLVTRTTLSQWKI 
FTEGEAQISQMCSSRVCRTELEDLVKVLYLERSEKGHC 


6126 


1224 


389 


RLLSEAPCPRSRRRFQMNPEWGQAFVHVAVAGGLCAVAVFTGIF 

DSVSVQVGYEHYAEAPVAGLPAFLAMPFNSIiVNMAYTLLGLSWL 

HRGGAMGIjGPRYLKDVFAAMALLYGPVQWLRLWI^WRRAAVLDQ 

WLTLPIFAWPVAWCLYLDRGWRP\WLFLSLECVSLASYGLALLH 

PQGFEVALGAHWPAVGQALRT\HRHYG/SATPSATYLALGVLS 

CLGFVVLKLCDHQLARWRLFQCLTGHFWSKVCDVLQFHFAFLFL 
THFNTHPRFHPSGGKTR


6127


1335 


463 


VLPRRCLVFVTOTMDSSREPTLGRLDAAGFWQVWQRFDADEKGY 
IEEKELDAFFLHMLMKLGTDDTVMKANLHKVKQQFMTTQDASKD
GRIRMKELAGMFLSEDEWFLLLFRRENPLDSSVEFMQIWRKYDA 
DSSGFISAAELRNFLRDLFLHHKKAISEAKLEEYTGTMMKIFDR
NKDGRLDLNDLARILALQENFLLQFKMDACSTEKRKGDFEKIFA 
YYDVSKTGALEGP \ EVDGF VKDMMELVQPS I SG VDLDKFRE I hh 
RHCDVNKDGKIQKSELALCLGLKINP 


6128


2*11 


843 


TC^MSRRQLERWVWSSQQVQARGRNVRAPRLGKIAMGLEMSSKD 
SPGSLDGRAWEDAQKPQSAWCGGRKTRVYATSSRRAPPSEGTRR 
GGAARPEKTAEEGPPAAPGSLRHSGPLGPHACPTALPEPQVTSA 
MSSQWGIEPLYIKAEPASPDSPKGSS2TETEPPVALAPG\PAP 
TRCLPGHKEEEDGEGAGPGEQGGGKLVLSSLPKRLCLVCGDVAS 
GYHYGVASCEACKAFFKRTIQGSIEYSCPASNECEITKRRRKAC
QACRFTKCLRVGMLKEGVRLDRVRGGRQKYKRRPEVDPLPFPGP 
F PAG P LAVAGG PR KTAAP VNALiVSHL L WEPE KL YAM PDPAG P D 
GHLPAVATLCDLFDREIWTISWAKSIPGFSSLSLSDQMSVLQS
VW^^VTJVLGVAORSLTLf)DR7»AVaP , VT.^/T.nPT7nIi^)o^lr•r n-or n\ 

AALLQLVRRLQALRLEREEYVLLKAIALANSDSVHIEDEPRLWS 
SCEKLLHEALLEYEAGRAGPGGGAERRRAGRLLLTLPLLRQTAG
KVLAHFYGVKLEGKVPMHKLFLEMLEAMMD 


6129 


1764 


771 


ARFARSAHEGKMPKKKTGARKKAENRREREKQLRASRSTIDLAK 
HPCNASMECDKCQRRQKNRAFCYFCNSVQKLPICAQCGKTKCMM 
KSSDCVIKHAGVYSTGLAMVGAICDFCEAWVCHGRKCLSTHACA
CPLTDAEC\VECERGVWDHGGRIFSCSFCHNFLCEDDQFEHQAS
CQVLEAETFKCVSCWRLGQHSCLRCKACFCDDHTRSKVFKQEKG
KQPPCPKCGHETQETKDLSMSTRSLKFGRQTGGEEGDGASGYDA 
YWKNLSSDKYGDTSYHDEEEDEYEAEDDEEEEDEGRKDSDTESS 
DLFTNLNLGRTYASGYAHYEEQEN 


6130 


3 


577 


GRGGTMREYKWVLGSG\GVGKSALTV\QFVTCTFIEKYDPTIE 
DFYRKEIEV\DSSPSVAGISWTQQGTEQF\ASMRDLYIKKGQGC 
IIiVYSLVNQQSFQ\DIKPMRDQIIRVKVSEKVPVl\LVGN\SVD 
LESEREVSSSEGRALAEEWGCPFMETSAXSKTMVDELFAEIVRQ 
MNYAAQPDKDDPCCSACNIQ 


6131 


3 


1811 


SSPREKTSDSSHRPSRHGFLFLRLVGLSPFSYLCVPPSRPVPGS 
PRSLSAMRLLPLAPGRLRRGSPRHLPSCSPALLLLVLGGCLGVF 
GVAAGTRRPNWLLLTDDQDEVLGGMTPLKKTKALIGEMGMTFS 
SAYVPSALCCPSRASILTGKYPHNHHWNNTLEGNCSSKSWQKI 
QEPNTFPAILRSMCGYQTFF\AGKYLNEYGAPDAGGLEHVPLGW
SYWYALEKNSKYYNYTLSINGKARKHGENYSVDYLTDVLANVSL
DFLDYKSNFEPFFMMTATP\APHSPWTAAPQYQKAFQNVFAPRN 
KWFNIHGTNKHWLIRQAKTPMTNSSIQFLDNAFRKRWQTLLSVD
DLVEKLVKRLEFTGELNNTYIFYTSDNGYHTGQFSLPIDKRQLY 
EFDIKVPLLVRGPGIKPNQTSKMLVANIDLGPTILDIAGYDLNK
TQMDGMSLLPILRGASNLTWRSDVLVEYQGEGRNVTDPTCPSLS
PGVSQCFPDCVCEDAYNNTYACVRTMSALWNLQYCEFDDQEVFV
EVYNLTADPDQITNIAKTIDPELLGKMNYRLMMLQSCSGPTCRT 
PGVFDPGYRFDPRLMFSNRGSVRTRRFSKHLL 


6132 


96 


1241 


AAGLLPPGLVPEDPRRTRNLLPFGIQGPPFALSRPLFSCVESGW
AWEAMEPEFLYDLLQLPKGVEPPAEEELSKGGKKKYLPPTSRKD
PKFEELQKPA\VLMEWINATLLPEHIWRSLEEDMFDGLILHHL 
FQRLAALKLEAEDIALTATSQKHKLTWLBAVNRS\CSWRSGRP 
SGA/WESIFNKDLLSTLHLLVALAKRFQPDLSLPTNVQVEVITI 
ESTKSGLKSEKLVEQLTEYSTDKDEPPKDVFDELFKLAPEKVNA 
VKEAIVNFVNQKLDRLGLSVQNLDTQFADGVILLLLIGQLEGFF 
LHLKEFYLTPNSPAEMLHNVTLALELL/IGRGPAQLPC/LALK/ 
TIVNKDAKSTLRVLYGLFCKHTQKAHRDRTPHGAPN 


6133 


2 


4256 


FVHGSMADTDLFMECEEEELEPWQKISDVIEDSWEDYNSVDKT
TTVSVSQQPVSAPVPIAAHASVAGHLSTSTTVSSSGAQNSDSTK
KTLVTLIANNNAGNPLVQQGGQPLILTQNPAPGLGTMVTQPVLR
PVQVMQNANHVTSSPVASQPIFITTQGFPVRNVRPVQNAMNQVG
IVLNVQQGQTVRPITLVPAPGTQFVKPTVGVPQVFSQMTPVRPG
STMPVRPTTNTFTTVIPATLTIRSTVPQSQSQQTKSTPSTSTTP
TATQPTSLGQLAVQSPGQSNQTTNPKLAPSFPSPPAVSIASFVT
VKRPGVTGENSNEVAKLVNTLNTIPSLGQSPGPVVVSNNSSAH\
GSQRTSGPESSMKVTSSIPVFDLQDGGRKICPRCNAQFRVTEAL
RGHMCYCCPEMVEYQKKGKSLDSEPSVPSAAKPPSPEKTAPVAS
/THPSSTPIPALSPPY/TKVPEPNENVGDAVQTKLIMLVDDFYY
GRDGGKVAQLTNFPKVATSFRCPHCTKRLKNNIRFMNHMKHHVE
LDQQNGEVDGHTICQHCYRQFSTPFQLQCHLENVHSPYESTTKC
KICEWAFESEPLFLQHMKDTHKPGEMPYVCQVCQYRSSLYSEVD
VHFRMIHEDTRHLLCPYCLKVFKNGNAFQQHYMRHQKR\NVYH\
CNKCRVQFLFAKDKIEHKLQHHKTFRKPKQLEGLKPGTKVTIRA
SRGQPRTVPVSSNDTPPSALQEAAPLTSSMDPLPVFLYPPVQRS
IQKRAVRKMSVMGRQTCLECSFEIPDFPNHFPTYVHCSLCRYST
CCSRAYANHMINNHVPRKSPKYLALFKNSVSGIKLACTSCTFVT
SVGDAMAKHLVFNPSHRSSSILPRGLTWIAHSRHGQTRDRVHDR
NVKNMYPPPSFPTNKAATVKSAGATPAEPEELLTPLAPALPSPA
STATPPPTPTHPQALALPPLATEGAECLNVDDQDEGSPVTQEPE
LASGGGGSGGVGKKEQLSVKKLRWLFALCCNTEQAAEHFRNPQ
RRIRRWLRRFQASQGENLEGKYLSFEAEEKLAEWVLTQREQQLP
VNEETLFQKATKIGRSLEGGFKISYEWAVRFMLRHHLTPHARRA
VAHTLPKDVAENAGLFIDFVQRQIHNQDLPLSMIVAIDEISLFL
DTEVLSSDDRKENALQTVGTGEPWCDWLAILADGTVLPTLVFY
RGQMDQPANMPDSILLEAKESGYSDDEIMELWSTRVWQKHTACQ
RSKGMLVMDCHRTHLSEEVLAMLSASSTLPAWPAGCSSKIQPL
DVCIKRTVKNFLHKKWKEQAREMADTACDSDVLLQLVLVWLGEV
LGVIGDCPELVQRSFLVASVLPGPDGNINSPTRNADMQEELIAS
LEEQLKLSGEHSESSTPRPRSSPEETIEPESLHQLFEGESETES
FYGFEEADLDLMEI


6134


2 


4256 


FVHGSMADTDLFMECEEEELEPWQKISDVIEDSWEDYNSVDKT
TTVSVSQQPVSAPVPIAAHASVAGHLSTSTTVSSSGAQNSDSTK
KTLVTLIANNNAGNPLVQQGGQPLILTQNPAPGLGTMVTQPVLR
PVQVMQNANHVTSSPVASQPIFITTQGFPVRNVRPVQNAMNQVG
IVLNVQQGQTVRPITLVPAPGTQFVKPTVGVPQVFSQMTPVRPG
STMPVRPTTNTFTTVIPATLTIRSTVPQSQSQQTKSTPSTSTTP
TATQPTSLGQLAVQSPGQSNQTTNPKLAPSFPSPPAVSIASFVT
VKRPGVTGENSNEVAKLVNTLNTIPSLGQSPGPVVVSNNSSAH\
GSQRTSGPESSMKVTSSIPVFDLQDGGRKICPRCNAQFRVTEAL
RGHMCYCCPEMVEYQKKGKSLDSEPSVPSAAKPPSPEKTAPVAS
/THPSSTPIPALSPPY/TKVPEPNENVGDAVQTKLIMLVDDFYY
GRDGGKVAQLTNFPKVATSFRCPHCTKRLKNNIRFMNHMKHHVE
LDQQNGEVDGHTICQHCYRQFSTPFQLQCHLENVHSPYESTTKC
KICEWAFESEPLFLQHMKDTHKPGEMPYVCQVCQYRSSLYSEVD
VHFRMIHEDTRHLLCPYCLKVFKNGNAFQQHYMRHQKR\NVYH\
CNKCRVQFLFAKDKIEHKLQHHKTFRKPKQLEGLKPGTKVTIRA
SRGQPRTVPVSSNDTPPSALQEAAPLTSSMDPLPVFLYPPVQRS
IQKRAVRKMSVMGRQTCLECSFEIPDFPNHFPTYVHCSLCRYST
CCSRAYANHMINNHVPRKSPKYLALFKNSVSGIKLACTSCTFVT
SVGDAMAKHLVFNPSHRSSSILPRGLTWIAHSRHGQTRDRVHDR
NVKNMYPPPSFPTNKAATVKSAGATPAEPEELLTPLAPALPSPA
STATPPPTPTHPQALALPPLATEGAECLNVDDQDEGSPVTQEPE
LASGGGGSGGVGKKEQLSVKKLRWLFALCCNTEQAAEHFRNPQ
RRIRRWLRRFQASQGENLEGKYLSFEAEEKLAEWVLTQREQQLP
VNEETLFQKATKIGRSLEGGFKISYEWAVRFMLRHHLTPHARRA
VAHTLPKDVAENAGLFIDFVQRQIHNQDLPLSMIVAIDEISLFL
DTEVLSSDDRKENALQTVGTGEPWCDWLAILADGTVLPTLVFY
RGQMDQPANMPDSILLEAKESGYSDDEIMELWSTRVWQKHTACQ
RSKGMLVMDCHRTHLSEEVLAMLSASSTLPAWPAGCSSKIQPL
DVCIKRTVKNFLHKKWKEQAREMADTACDSDVLLQLVLVWLGEV
LGVIGDCPELVQRSFLVASVLPGPDGNINSPTRNADMQEELIAS
LEEQLKLSGEHSESSTPRPRSSPEETIEPESLHQLFEGESETES
FYGFEEADLDLMEI


6135


2 


4256 


FVHGSMADTDLFMECEEEELEPWQKISDVIEDSWEDYNSVDKT
TTVSVSQQPVSAPVPIAAHASVAGHLSTSTTVSSSGAQNSDSTK
KTLVTLIANNNAGNPLVQQGGQPLILTQNPAPGLGTMVTQPVLR
PVQVMQNANHVTSSPVASQPIFITTQGFPVRNVRPVQNAMNQVG
IVLNVQQGQTVRPITLVPAPGTQFVKPTVGVPQVFSQMTPVRPG
STMPVRPTTNTFTTVIPATLTIRSTVPQSQSQQTKSTPSTSTTP
TATQPTSLGQLAVQSPGQSNQTTNPKLAPSFPSPPAVSIASFVT
VKRPGVTGENSNEVAKLVNTLNTIPSLGQSPGPVVVSNNSSAH\
GSQRTSGPESSMKVTSSIPVFDLQDGGRKICPRCNAQFRVTEAL
RGHMCYCCPEMVEYQKKGKSLDSEPSVPSAAKPPSPEKTAPVAS
/THPSSTPIPALSPPY/TKVPEPNENVGDAVQTKLIMLVDDFYY
GRDGGKVAQLTNFPKVATSFRCPHCTKRLKNNIRFMNHMKHHVE
LDQQNGEVDGHTICQHCYRQFSTPFQLQCHLENVHSPYESTTKC
KICEWAFESEPLFLQHMKDTHKPGEMPYVCQVCQYRSSLYSEVD
VHFRMIHEDTRHLLCPYCLKVFKNGNAFQQHYMRHQKR\NVYH\
CNKCRVQFLFAKDKIEHKLQHHKTFRKPKQLEGLKPGTKVTIRA
SRGQPRTVPVSSNDTPPSALQEAAPLTSSMDPLPVFLYPPVQRS
IQKRAVRKMSVMGRQTCLECSFEIPDFPNHFPTYVHCSLCRYST
CCSRAYANHMINNHVPRKSPKYLALFKNSVSGIKLACTSCTFVT
SVGDAMAKHLVFNPSHRSSSILPRGLTWIAHSRHGQTRDRVHDR
NVKNMYPPPSFPTNKAATVKSAGATPAEPEELLTPLAPALPSPA
STATPPPTPTHPQALALPPLATEGAECLNVDDQDEGSPVTQEPE
LASGGGGSGGVGKKEQLSVKKLRWLFALCCNTEQAAEHFRNPQ
RRIRRWLRRFQASQGENLEGKYLSFEAEEKLAEWVLTQREQQLP
VNEETLFQKATKIGRSLEGGFKISYEWAVRFMLRHHLTPHARRA
VAHTLPKDVAENAGLFIDFVQRQIHNQDLPLSMIVAIDEISLFL
DTEVLSSDDRKENALQTVGTGEPWCDWLAILADGTVLPTLVFY
RGQMDQPANMPDSILLEAKESGYSDDEIMELWSTRVWQKHTACQ
RSKGMLVMDCHRTHLSEEVLAMLSASSTLPAWPAGCSSKIQPL
DVCIKRWKKFLHKKWKEQAREMADTACDSDVLLQLVLVWLGEV
LGVIGDCPELVQRSFLVASVLPGPDGNINSPTRNADMQEELIAS
LEEQLKLSGEHSESSTPRPRSSPEETIEPESLHQLFEGESETES
FYGFEEADLDLMEI


6136 


1704 


539 


FGVRMALEGMSKRKRKRSVQEGENPDDGVRGSPPEDYRLGQVAS 
SLFRGEHHSRGGTGRLASLFSSLEPQIQPVYVPVPK\ESALASA 
DLEEEIHQKQGQKRKNSQPGVKVADRKILDDTEDTWSQRKKIQ 
INQEEERLKNERTVFVGNLPVTCNKKKLKSFFKEYGQIESVRFR 
SLIPAEGTLSKKLAAIKRKIHPDQKNINAYWFKEESAATQALK 
RNGAQIADGFRIRVDLASETSSRDKRSVFVGNLPYKVEESAIEK
HFLDCGSIMAVRIVRDKMTGIGKGFGYVLFENTDSVHLALKLNN
SELMGRKLRVMRSVNKEKFKQQNSNPRLKNVSKPKQGLNFTSKT
AEGHPKSLFIGEKAVLLKTKKKGQKKSGRPKKQRKQK


6137 


141 


2656 


RALRKRRCGPGRRGALGSGPGPQRRPGRVPEERPAPPRERKHPG 
MWNMLIVAMCLA\LLGLPGKAQELQGHVS\IILAGEQLGDLAKK 
YLWQG\LFQLYLDEAGRGHSFSFHGAALTAPKQGQELMAKALES 
LSCPKDMAPSHCAEHKDQFLQLSQYRQLKTAEDYQALNKDIEAQ
LQHAGLREAGGIFYFSVPPFAYEDIARNINSSCRPGPGAWLRW 
LEKPFGHDHFSAQQLATELGTFFQEEEMYRVDHYLGKQAVAQIIi 
PFRDQNRKALDGLWNRHHVERVEIIMKETVDAEGRTSFYEEYGV 
IRDVLQNHLTEVLTLVAMELPHNVSSAEAVLRHKLQVFQALRGL 
QRGSAWGQYQSYSEQVRRELQKPDSFHSLTPTFAGVLVHIDNL 
RWEGVPFILMSGKALDERVGYARILFKNQACCVQSEKHWAAAQS 
QCLPRQLVFHIGHGDLGSPAVLVSRNLFRPSLPSSWKEMEGPPG
LRLFGSPLSDYYAYSPVRERDAHSVLLSHIFHGRKNFFITTENL 
LASWNFWTPLLESLAHKAPRLYPGGAENGRLLDFEFSSGRLFFS 
QQQPEQLVPGPGPGPMPSDFQVLRAKYRESSLVSAWSEELISKL 
ANDIEATAVRAVRRFGQFHLALSGGSSPVALFQQLATAHYGFPW 
AHTHLWLVDERCVPLSDPESNFQGLQAHLLQHVRIPYYNIH\AM 
PVHLQQRLCAEEDQGAHIYAREISALGANSSFDLVLLGMGADGH 
TASLFPQSPTGLDGEQLWLTTSPSQPHRRMSLSLPLINRAKKV 
AVLVMGRMKREITTLVSRVGHEPKKWPISGVLPHSGQLVWYMDY 
DAFLG 


6138 


4587 


$34 


EFSKLTDRWQNAVQGVRQRKGDVDGLVRQWQDFTTSVENLFRFL 
iui £>Hijij&AvK<jybKf bbiQrKSLIHELKNKEIHFQRRRTTCAL 
TLEAGEKLLLTTDLKTKESVGRRISQLQDSWKDMEPQLAEMIKQ 
FQSTVETWDQCEKKIKELKSRLQVLKAQSEDPLPELHEDLHNEK 
EL I KELEQS LAS WTQNLKELQTM KADLTRHVIjVEDVMVLKEQ I E 
HLHRQWEDLCLRVAIRKQEIEDRLNTWWFNEKNKELCAWLVQM 
ENKVLQTADISIEEMIEKLQKDCMEEINLFSENKLQLKQMGDQL 
IKASNKSRAAEIDDKLNKINDRWQHLFDVIGSRVKKLKETFAFI 
QQLDKNMSNLRTWLARIESELSKPWYDVCDDQEIQKRLAEQQD
LQRDIEQHSAGVESVFNICDVLLHDSDACANETECDSIQQTTRS 
LDRRWRNICAMSMERRMKIEETWRLWQKFLDDYSRFEDWLKSAE 
RTAACPNSSEVLYTSAKEELKRFEAFQRQIHERLTQLELINKQY
RRLARENRTDTASRLKQMVHEGNQRWDNLQRRVTAVLRRLRHFT 
NQREEFEGTRESILVWLTBMDLQLTNVEHFSESDADDKMRQLNG * 
FQQEITLNTNKIDQLIVFGEQLIQKSEP\LDAVLIEDELEELHR 
YCQEVFGRVSRFHRRLTSCTPGLEDEKEASENETDMEDPREIQT 
DSWRKRGESEEPSSPQSLCHLVAPGHERSGCETPVSVDS\IPLE 
WDHTGRRGGPSSSH\EEDEEAQYY\SALSGKSISDGHSWHVPDS
PSCPEHHYKQMEGDRNVPPVPPASSTPYKPPYGKLLLPPGTDGG
KEGPRVLNGNPQQEDGGLAGITEQQSGAFDRWEMIQAQELXHNK
LKIKQNLQQLNSDISAITTWLKKTEAELEMLKMAKPPSDIQEIE
LRVKRLQE I LKAFDTYKAIiWS VNVSS KE FLQTES PESTELQSR 
LRQLSLLWEAAQGAVDSWRGGLRQSLMQCQDFHQLSQNLLLWLA 
SAKNRRQKAHVTDPKADPRALLECRRELMQLEKELVERQPQVDM 
LQEISNSLLIKGHGEDCIEAEEKVHVI\EKKLKQLREQVSQDLM 
ALQGTQNPASPLPSFDEVDSGDQPPATSVPAPRAKQFRAVRTTE 
GEEETESRVPGSTRPQRSFLSRWRAALPLQLLLLLLLLLACLL 
PSSEEDYSCTQANNF\ARSFYPMLRYTNGPPPT 


6139 


52 


1131 


LGDWVWSRTCGVLETPTSVLRRARARGPCPTDSKWALPRLREGE 
TERRPWEASSWKTL/LAGWIGGAASVIVGHPLDTVKTRLQAGVG 
YGNTLSCIRWYRRESMFGFFKGMSFPLASIAVYNSWFGVFSN
TQRFLSQHRCGEPEASPPRTLSDLLLASMVAGWSVGLGGPVDL
IKIRLQMQTPPVSGRQPRFEVQGSGSCG\EPAYQGPVHCITTIV 
RNEGLAGLYRGASAMLLRDVPGYCLYFIPYVFLSEWITPEACTG
PSPCAVWLAGGMAGAISWGTATPMDWKSRLQADGVYLNKYKGV
LDCISQSYQKEGLKVFFRGITVNAVRGFPMSAAMFLGYELSLQA 
IRGDHAVTSP 


6140 


694 


136


RPELELWRLRSRSWRPLGVPRRCHRRNWKBPVRAQPLSVTVWAP 
RCQRP/QPPAPEPSSPNAAVPEAI PTPRAAASAALET,PT/3PAPV 
SVAPQAEAEARSTPGPAGSRLGPETFRQRFRQFRYQDAAGPREA 
FRQLREL/ SPRQWLRPDI \RTKEQ\ I VEMLVQEQbLAILPEAAR 
ARRIRRRTDVRITG 


6141 


2


984 


AQVGPRSRPCKMPLKLRGKKKAKSKETAGLVEGEPTGAGGGSLS
ASRAPARRLVFHAQLAHGSATGRVEGFSSIQELYAQIAGAFEIS 
PSEILYCTLNTPKIDMERLLX3GQLGLEDFIFAHVKGIEKEVNVY 
KSEDSLGLTITDNGVGYAFIKRIKDGGVIDSVKTICVGDHIESI
NGENIVGWRHYDVAKKLKELKKEELFTMKLIEPKKAFEIELRSK 
AGKSSGEKIGCGRATLRLRSKGPATVEEMPSETKAK\AIEKIDD 
VLELYMGIRDIDLATTMFEAGKDKVNPDEFAVALDETLGDFAFP
DEFVFDVWGVIGDAKRRGL


6142 


116 


602 


EAEGEQVCGAKCCGDAPHVENREEETARIGPGVMESKEERALNN 
LIVENVNQENDEKDEKEQVANKGEPLALPLNVSEYCVPRGNRRR 
FRVRQPILQYRWDIMHRLGEPQARMREENMERIGEEVRQLMEKL
REKQLSHSLRAVSTDPPHHDHHDEFC\LMP 


6143 


2802 


270 


FRMRIFLHCPWNQQMWKIWNLLETSLESCKAHtiSIQKLLKER\Q 
\QLPVFKHRDSIVETLKRHRVWVAGET\GSGKSTQVPHFLLED 
LLLNEWEASKCNIVCTQPRRISAVSLANRVCDELGCENGPGGRN 
SrXGYQIRMESRACESTRLLYCTTGVLLRKLQEDGLLSNVS/HM 
FIVDEV\HER\SVQSDFLLIILKEILQKRSDLHLILMSATVDSE 
KFSTYFTHCPILRISGRSYPVEVFHLEDIIEETGFVLEKDSEYC 
QKFLEEEEEVTINVTSKAGGIKKYQEYIPVQIX3AHAI)LNPFYQK 
YS SRTQHA I L YMN P 1 1 K I NLDL I LELLAYLD KS PQ FRN I EGAVL I 
FLPGLAHIQQLYDLLSNDRRFYSERYKVIALHSILSTQDQAAAF
TLPP PGVRKI VIiATNIAETGITI PDWFVIDTGRTKENKYHESS 
QMSSLVETFVSKASALQRQGRAGRVRDGFCFRMYTRERFEGFMD 
YSVPEILRVPLEELCLHIMKCNLGSPEDFLSKALDPPQLQVISN 
AMNLLRKIGACELNEPKLTPLGQHLAALPVNVKIGKMLIFGAIF 
GCLDP VATLAAVMTEKS PFTTP I GRKDEADLAKSALAMADSDHI* 
TIYNAYLGWKKARQEGGYRSEITYCRRNFLNRTSLLTLEDVKQE
LIKLVKAAGFSSSTTSTSWEGNRASQTLSFQEIALLKAVLVAdL ' 
YDNVGKIIYTKSVDVTEKLACIVETAQGKAQVHPSSVNRDLQTH
GWLLYQEKIRYARVYLRETTLITPFPVLLFGGDIEVQHRERLLS 
IDGWIYFQAPVKIAVIFKQLRVLIDSVLRKKLENPKMSLENDKI 
LQIITELIKTENN 


6144 


1289 


568 


SGPGSMSGQRVDVKWMLGKEYVGKTSLVERYVHDRFLVGPYQN
VSASGGARHGGRGSGGPVICTYGPDLFPLVA\TIGAAFVAKVMS 
VGDRTVTLGIWDTAGSERYEAMSRIYYRGAKAAIVCYDLTDSSS
FERAKFWVKELRSLEEGCQIYLCGTKSDLLEEDRRRRRVDFHDV 
QDYADNIKAQLFETSSKTGQSVDELFQKVAEDYVSVAAFQVMTE
DKGVDLGQKPNPYFYSCCHH 


6145 


1109 


196 


GGMDLSELERDNTGRCRLSSPVPAVCRKEPCVLGVDEAGRGPVL 

GPMVYAICYCPLPRLABLEALKVADSKTLLESERERLFAKMEDT 

DFVGWALDVLSPNLISTSMLGRVKYNLNSLSHDTATGLIQYALD 

QGVNVTQVFVDTVGM PET YQART.QQS FPG I E VTVKAKADALYP V 

\VSAASICAKVARDQAVKKWQFVEKLQDLDTDYG\SGYPNDPQD

/TKAWLKEHVEPVF\GFP\QFVRF\SWRTAQTI\LEKEAEDVIR 

EDSASENQEGLRKITSYFLNEGSQARPRSSHRYFLERGLESTTS 
L 


6146 


428 


781 


LKKKGKEKAEAQQVEALPGPSLDQWHRSAGEEEDGPVLTDEQKS
R/YPGHEAHDQGG\WDARQSIIRKWDPETGRTRLIKGDGEVLE 
EIVTKERHREINKQATRGDCLAFQMRAGLLP 


6147 


1 


2304 


GTRQLPPPS PGSGPGDS PE6PEGEAPERRRKAHGMLKLYYGLSE 
GEAAGRPAGPDPLDPTDLNGAHFDPEVYLDKLRRECPLAQLMDS 
E TDM VRQ I RALDS DMQTL VYEN YNKF I S ATDT I R KMKND FR KME 
DEMDRLATNMAVITDFSARISATLQDRHERITKLAGVHALLRKL 
QFLFELPSRLTKCVELGAYGQAVRYQGRAQAVLQQYQHLPSFRA 
IQDDCQVITARLAQQLRQRFREGGSGAPEQAECVELLLALGEPA 
E E LCE E FLAHARG RLE KEL RN LEAE LG P S P PAP D VLE FTDHG \ S 
SGF VGGLCQ VAAAYQEL FAAQGPAGAE KLAAFARQLGSR YFALV 
ERRLAQEQGGGDNSLLVRALDRFHRRLRAPGALLAAAGLADAAT 
EIVERVARERLGHHLQGLRAAFLGCLTDVRQALAAPRVAGKEGP 
GLAEL LANVAS S I LS H I KAS LAAVHL FTAKEVS FSN KP Y FRG E F 
CSQGVREGLIVGFVHSMCQTAQSFCDSPGEKGGATPPALLLLLS 
RLCLD YETATI S YI LTLTDEQFLVQDQFPVTP VS TLCAEARETA 
RRLLTHYVKVQGLVISQMLRKSVETRDWLSTLEPRNVRAVMKRV 
VEDTTAIDVQVLPRLAGVALTQAGGTVPSRGAGAAEDHWQSLPG 
GGDMCIWASHGASSVARASVREPQGNKSPRMNTKRAGECLCPRS 
CSFSAQDYDIFAPILPVEKQRLRVTQEVRAGLVLVLKIRPQTNS 
CILPLPHSTGSINSDHVPTK 


6148


3056 


353 


VPAVGGTFADGAMGEAEKFHYIYSCDLDINVQLKIGSLEGKREQ 
KSYKAVLEDPMLKFSGLYQETCSDLYVTCQVFAEGKPLALPVRT 
SYKAFSTRWNWNEWLKLPVKYPDLPRNAQVALTIWDVYGPGKAV 
PVGGTTVSLFGKYGMFRQGMHDLKVWPWCRSQMDQKPTKTPGRT 
SSTLSEDQMSRLAKLTKAHRQGHMVKVDWLDRLTFREIEMINES 
VKRSSNFM YLMGG FRCVKCDDKE YG I VY YEKDGDES S P ILTS FE 
LVKVPDPQMSLENLVESKHHNLPRSLRSGPSDHDLKPYPSPRDQ 
LKNIVSYPPSKPPTYEEQDLVWEFRYYLTNQDKALTKILTSVIW 
DL PQGAKQALALLG KWKPMDVEDS LELLSSHYTNPTVRR YAVAR 
LRQADDEDLLMYLLQLVQALKYENFDDIKNGLEPTKKDSQSSVS 
ENVSNSGINSAEIDSSQIIT/SAPFPSVSSPPP\ASKTKEVPDG 
ENLEQDLCTFLISRASKNSTLANYLYWYVIVECEDQDTQQRDPK 
THEMYLNVMRRFSQALLKGDKSVRVMRSLLAAQQTFVDRLVHLM 
KAVQRESGNRKKKNERLQALLGDNEKMNLSDVELIPLPLEPQVK 
I RG I I PE TATLFKS ALM P AQL FF KTEDGG KYPVI FKHG DDLRQD 
Q L I LQ 1 1 S LMDKLLRKENLDL KLT P YKVLATS TKHG FMQ F I QS V 
PVAEVLDTEGSIQNFFRKYAPSENGPNGISAEVMDTYVKSCAGY
CV I T Y I LG VGDRHLDNLLLTKTG KL FH IDFGYILGRDPKPLPPP 
MKLNKEMVEGMGGTQSEQYQEFRKQCYTAFLHLRRYSNLILNLF 
SLMVDANIPDIALEPDKTVKKVQDKFRLDLSDEEAVHYMQSLID 
ESVHALFAAWEQIHKFAQYWRK 


6149 


1 


1413 


RVDPRVRENGTANPIKNGKTSPASKDQRTGKKTSVQGQVQKGND

ESESDFESDPPSPKSSEEEEQDDEEVLQGEQGDFNDDDTEPENL 

GHRPLLMDSEDEEEEEKHSSDSDYEQAKAKYSDMSSVYRDRSGS 

GPTQDLNTILLTSAQLSSDVAVETPKQEFDVFGAVPFFAVRAQQ 

P QQ E KNE KNL PQHR F P AAGLEQEE FDV FT KAP FS KKVNVQE CHA 

VGPEAHTIPGYPKSVDVFGSTPFQPFLTS7SKSESNEDLFGLVP 

FDEITGSQQQKVKQRSLQKLSSRQRRTKQDMSKSNGKRHHGTPT 

S TKKTLKP T YR TPE RARRH KKVGRRDS QS SNE FLT I S DS KEN I S 

VALTDGKDRGNVLQPEESLLDPFGAKPFHSPD\LSWHPP\HQGL 

S\DIRADHNT\VLPGR\PRQNSLHGSFHSADVLKMDDFGAVP/F 

LTELWQSITPHQSQQSQPV\ELDPFGAAPFPSKQ 


6150 


372 


37 


MSNIKKYIIDYDWKASIEIEIDHDVMTEEKLHQINNFWSDSEYR 
LN KHGS VLNAVL I MLAQHALL I A I S S DLNAYG WCE FDWNDGNG 
QEGWPPMDGSEGIRITDIDTSGIF 


6151 


1555 


521 


DSNQQS VSGTAAS TLLHS FKATI Y YQGTGHVQQF YGVTS PYSQT 
T P P I VQS YAQ P S LQY I QG QQ I FTAHP QGWVQP AAAVTT I VA P G 
QPQPLQPSEWWTNNLLDIiPPPS PPKPKTI VTjPPNWKTAP nppr* 

KIYYYHVITRQTQWDPPTWESPGDDASLEHEAEMDLGTPTYDEN 
PM K \ AS KK P KTAE ADT S S E LAKK S KE VFR KEM S Q FI VQ CLN P YR 

KPDCKVG\RITTTEDFKHLARKLTHGVMNKELKYCKNPE\DLEC 
NENVKHKTKEYIKKYMQKFGAVYKPKEDTEFRVTVGPGWEDGWS 
GKTDSRERKSCGPFCSTPVSTVLLMIHHPGBFNPADVN 


6152 


1366 


648 


NRTWSTPSTWMGVALPPLCSTGPWPVTRQITARTTCGAVPAKCP 
PWC/DVHEPRCQPPDCHGHGTCVDGHCQCTGHFWRGPGCDELDC 
GPSNCSQHGLCTETGCRCDAGWTGSKTCSEECPLGWHGPGCQRPC 
KCEHHCPCDPKTGNCSVSRVKQCLQPPEATLRAGELSFFTRTAW 
LALTLALAFLLLISTAANLSLLLSRAERNRRLHGDYAYHPLQEM 
NGEPLAAEKEQPGGAHNPFKD 


6153 


2 


3368 


GRVGARSPGRAYALLLLLICFNVGSGLHLQVLSTRNENKLLPKH 
PHLVRQKRAWITAPVALLEGEDLSKKNP IAKIHSDLAEERGLKI 
TYKYTGKGITEPPFGIFVFNKDTGELNVTSILDREETPFFLLTG 
YALDARGNNVEKPLELRIKVLDINDNEPVFTQDVFVGSVEELSA 
AHTLVMKINATDADEPNTLNSKISYRIVSLEPAYPPVFYLNKDT 
GEIYTTSVTLDREEHSSYTLTVEARDGNGEVTDKPVKQAQVQIR 
ILDVNDNIPWENKVLEGMVEENQVNVEVTRIKVFDADEIGSDN 
WLANFTFASGNEGGYFHIETDAQTNEGIVTLIKEVDYEEMKNLD 
FSVIVANKAAFHKSIRSKYKPTPIPIKVKVKNVKEGIHFKSSVI 
S I YVSESMDRSSKGQI IGNFQAFDEDTGLPAHARYVKLEDRDNW 
ISVDSVTSEIKLAKLPDFESRYVQNGTYTVKIVAISEDYPRKTI 
TGTVLINVEDINDNCPTLIEPVQTICHDAEYVNVTAEDLDGHPN 
SGPFSFSVXDKPPGMAEKWKIARQESTSVLLQQSEKKLGRSEIQ 
FL I S DNQG FS C PE KQ VLTLTVCE VLHG S \ GCREAQHDS YVGLGP 
AAIALM I LAFLLLLLVPLLLLMCHCGKGAKGFTP I PGT I EMLHP 
WNNEGAPPEDKWPSFLPVDQGGSLVGRNGVGGMAKEATMKGSS 
SASIVKGQHEMSEMDGRWEEHRSLLSGRATQFTGATGAI\MTTE 
TT I TARATG AS RD VAGAQAAAVALNE E FL KN Y FTD KAAS YTEED 
ENHTAKDCLLVYSQEETESLNAS IGCCSFI EGELD0RFLDDLGL 
KFKTLAEVCLGQKIDINKEIEQRQKPATETSMNTASHSLCEQTM 
VNSENTYSSGSSFPVPKSLQEANAEKVTQEIVTERSVSSRQAQK 
VATPLPDPMASRNVIATETSYVTGSTMPPTTVILGPSQPQSLIV 
TERVYAPASTLVDQPYANEGTVWTERVIQPHGGGSNPLEGTQH 
LQDVPYVMVRERESFLAPSSGVQPTLAMPNIAVGQNVTVTERVL
APASTLQSSYQIPTENSMTARNTTVSGAGVPGPLPDFGLEESGH 
SNSTITTSSTRVTKHSTVQHSYS 


6154 


3660 


Sl4d 


KKKTKMKNTLQKTVNFGAWPKPTISDKSHLLQMVSKLDLTDAKN 
SDTAH I KS I E I TS I LNGLQASES SAEDSEQEDERGAQDMDNNGK 
EESKIDHLTNNRNDLISKEEQNSSSLLEENKVHADLVISKPVSK 
S P ERLRKD I EVLS EDTD YEEDEVTKKRKDVKKDTTDKS S KPQ I K 
R G KRR YCNTEE CIjKTGS P GKKEE KAKNKES LCMENS SNS S S DED 
EEETKAKMTPTKKYNGLEEKRKSLRTTGFYSGFSEVAEKRIKLL 
NNSDERLQNSRAKDRKDVWSSIQGQWPKKTLKELFSDSDTEAAA 
SPPHPAPEEGVAEESLQTVAEEESCSPSVELEKPPPVNVDSKPI 
EEKTVEVNDRKAEFPSSGSNFSA* IPLPYLHLNRLHQSL*QKGS 
RQQSSVTVSEPLAPNQEEVRSIKSETDSTIEVDSVAGELQDLQS 
ERE* LASRF* CQCELEQ+ * SARTRTS * KSLYRSEKSERCSGRRK 
FI KKAEKKP * SNSGKQQKEGK 


6155 


869 


121 


HLLFELRGKSWITMKYVFYIjGVLAGTFFFADSSVQKEDPAPYLV 
YLKSHFNPCVGVLIKPSWVLAPAHCYLPNLKVMLGNFKSRVRDG 
TEQT INP I Q I VR Y WNYS HSAPQDDLML I KLAKPAMLNPKVQALN 
P\PTTNVRPGTVCLLSGLDWSQENSGRHPDLRQNLEAPVMSDRE 
CQKTEQGKSHRNSLCVKFVKVFSRIFGEVAVATVICKDKLQGIE 
VGHFMGGDVGI YTNVYKYVS WIENTAKDK 


6156 


5725 


3984 


GTST VTMATKKHFS I ILNLLGMLLKKDNQDTRKLLMTWALEVAV 
VMKKSETYAPLFCLPSFHKFCKGLLADTLVEDVNICLQACSSLH 
ALSSSLPDDLLQRCVDVCRVQLVHRGTCIRQAFGKLLKSIPLGV 
FLSNNNHTEIQEISLALRSHMSKAPSNTFHPQDFSD/VISFILY 
GNSHRTGKDNWLERLFYSCQRLDKRDQSTIPRNLLKTDAVLWQW 
A I WEAAQ FTVLS KLRTPLGRAQDT FQT I EG 1 1 R SLAGHT liNPDQ 
DVSQWTTADNDEGHGNNQLRLVLLLQYLENLEKLMYNAYEGCAN 
ALTSPPKVIRTFLYTNRQTCQDWLTRIRLSIMRVGLLAGQPAVT 
VRHGFDLLTEMKTTSLSQGNELEVS IMMWEALCELHCPEAI QG 
IAVWSSSIVGKHLLWINSVAQQAEGRFEKASVEYQEHLCAMTGV 
DCCISSFDKSVLTliASAGCKSASLKHCLNGESRKSVLSKPTDSS 
PEVINYLGNKACECYISTADWAAVQEWQNAIHDLKKSTSSTSLN 
LKADFNYIKSLSSFESGKFVECTEQLELLPGENINLLAGGSKEK 
IDMKKLLRNM 


6157 


946 


329 


MANRGPSYGLSREVQEKIEQKYDADLENKLVDWIILiQCAEDIEH 
PPPGRAHFQKWLMDGTVLCKL I NSLYPPGQEP I PKI SESKMAFK 
QMEQ I S Q FLKAAE T YG VRTTD I FQTVDLW EG KDMAAVQRTLMAL 
GSVAVTKDDGCYRGEPSWFHRKAQQNRRGFSEEQLRQGQNVIGL 
QMGSNKGASQAGMTG YGM PRQ I M* DAAS CP 


6158


441 


1482 


LGSLIVLSLHCKVIFSSQSLERAMKEKAVDLVPILAQNPGLAQN 
PILEGKDHNQNTGVDPI IDHVQDRKTD/ SRSKSPHKKRSKSRER 
RKSRSRSHSRDKRKDTREKIKEKERVKEKDREKEREREKEREKE 
KERGKNKDRDKEREKDREKDKEKDREREREKEHEKDRDKEKEKE 
QDKEKEREKDRSKEIDEKRKKDKKSRTPPRSYNASRRSRSSSRE 
RRRRRSRSSSRSPRTSKTIKRKSSRSPSPRSRNKKDKKREKERD 
HISERRERERSTSMRKSSNDRDGKEKLEKNSTSLKEKEHNKEPD 
S S VS KE VDDKDAP RTE EN KI QHNGNCQLNEENLS T KTEAV 


6159 


53


84 


AVIAPLHISLGDRARPYLKNTEKSSTTCSRRRNQSFPPVMSLTH 
RLHLCKYWGCAVSNVCRFWEGRPLPLMIWPYTLPVSLPVGSCV 
IITGTPILTFVKDPQLEVNFYTGMDEDSDIAFQFRLHFGHPAIM 
NSCVFGIWRYEEKCYYLPFEDGKPFELCIYVRHKEYKVMVNGQR 
I YNFAHRFP PAS VKMLQVFRD I SLTRVLI SD*GRCVRI TAVQE F 
DVSVSCDCTTAYQPG 


6160 


1626 


1790 


AGAKFFP + F * KVADAQ PTES EKE I YNQVNWLKDAEG I LEDLQS 
YRGAGHEIREAIQHPADEKLQEKAWGAWPLVGKLKKFYEFSQR 
SEQ
ID
NO:


Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Predicted end
nucleotide
location
corresponding
to first
amino acid
residue of
amino acid
sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








LEAALRGLLGALTSTPYSPTQHLEREQALAKQFAEILHFTLRFD 
ELKMTNPAIQNDFSYYRRTLSRMR1NNVPAEGENEVNNELANRM 
SLFYAEATPMLKTLSDATTKFVSENKNLPIENTTDCLSTMASVC 
R VM LET P E YRS R FTNE ET VS FCLR VMVG V I I L YDKVH P VGAFAK 
TSKIDMKGCIKVLKDQPPNSVEGLLNALRYTTKHLNDETTSKQI 
KSMLQ*QLLTLVNKG 


6161 


455 


1569 


PVSGSESSLRRAWASILRLMLGPRVAVSILCEDGISH*LLEKH* 
KSHVLEPLSSLALEEQCLALSLDWSTGKTGRAGDQPLKIISSDS 
TGQLHLLM VNETRPRLQXVAS WQAHQFEAW I AAFNYWHPE I VYS 
GGDDGLLRGWDTRVPGKFLFTSKRHTMGVCSIQSSPHREHILAT 
GSYDEHILLWDTRNMKQPLADTPVQGGWRIKWHPFHHHLLLAA 
CMHSGFKILNCQKAMEERQEATVLTSHTLPDSLVYGADWSWLLF 
RSLQRAPSWSFPSNLGTKTADLKGASELPTPCHECREDNDGEGH 
ARPQSGMKPLTEGMRKNGTWLQATAATTRDCGVNPEEADSAFSL 
LATCS F YDHALHLWE WEGN 


6162 


X 


586 


RTIHATGRAGASPMHRLIVWRLAEANKQHVRCQKCLEFGHWTYE 
CTG KRKYLHR PS RTAE LKKALKE KENRLLLQQS I GETNVERKAK 
KKRSKSVTSSSSSSSDSSASDSSSESEETSTSSSSEDSDTDESS 
SSSSSSASSTTSSSSSDSDSDSSSSSKQ*HQHR*QL*R*TTKEE 
EKE I ELLHS Y WTDGLKTLM 


6163 


1081 


785


RIRSTTEGCAVRLHPTQNTGKARIMILLSVSLGRHWAFTYKFFL 
T PWF VFFFFFFHRKE * VMQKNPMKSREDE WMEKLNNLHVQRAD 
MNRL I MN YLVTEGFKEAAEKFRMESG I EPS VDLETLDER I KI RE 
MILKGQIQEAIALINSLHPELLDTNRYLYFHLQQQHLIELIRQR 
ETEAALSFAQTQLAEQGEESRECLTEMERTLALLAFDSPEESPF 
GDLLHTMQRQKVWSEVNQAVLDYENRESTPKLAKLLKLLLWAQN 
ELUQKKVKYPKMTDLSKGVIEEPK 


6164 


90 


406


PCQSPGRSRMRQDKLTGSLRRGGRCLKRQGGGVGTILSNVLKKR 
SCISRTAPRLLCTLEPGVDTKLKFTLEPSLGQNGFQQWYDALKA 
VARLSTGIPKEWRRKVWLTLADHYLHSIAIDWDKTMRFTFNERS
NPDDDSMGIQIVKDLHRTGCSSYCGQEAEQDRWLKRVLLAYAR
WNKTVGYCQGFNILAALILEVMEGNEGDALKIMIYLIDKVLPES 
YFVNNLRALSVDMAVFRDLLRMKLPELSQHLDTLQRTANKESGG
GYEPPLTNVFTMQWFLTLFATCLPNQTVLKIWDSVFFEGSEIIL 
RVSLAIWAKLGEQIECCETADEFYSTMGRLTQEMLENDLLQSHE 
LMQTVYSMAPFPFPQLAELREKYTYNITPFPATVKPTSVSGRHS 
KARDSDEENDPDDEDAWNAVGCLGPFSGFLAPELQKYQKQIKE 
PNEEQSLRSNNIAELSPGAINSCRSEYHAAFNSMMMERMTTDIN 
ALKRQYSRIKKKQQQQVHQVYIRADKGPVTSILPSQVNSSPVIN 
HLLLGKKMKMTNRAAKNAVIHIPGHTGGKISPVPYEDLKTKLNS
PWRTHIRVHKKNMPRTKSHPGCGDTVGLIDEQNEASKTNGLGAA
EAFPSGCTATAGREGSSPEGSTRRTIEGQSPEPVFGDADVDVSA
VQAKLGALELNQRDAAAETELRVHPPCQRHCPEPPSAPEENKAT

SKAPQGSNSKTPIFSPFPSVKPLRKSATARNLGLYGPTERTPTV 
HFPOMSRSFSKPGGGNSGP*KMVP c ?QrtTMT.qpnT.i3r»vDr»c*voTs^T 
GGERPG 


6165 


90 


406 


PCQSPGRSRMRQDKLTGSLRRGGRCLKRQGGGVGTILSNVLKKR 
SCISRTAPRLLCTLEPGVDTKLKFTLEPSLGQNGFQQWYDALKA 
VARLSTGIPKEWRRKVWLTLADHYLHSIAIDWDKTMRFTFNERS
NPDDDSMGIQIVKDLHRTGCSSYCGQEAEQDRWLKRVLLAYAR 
WNKTVGYCQGFNILAALILEVMEGNEGDALKIMIYLIDKVLPES 
YFVNNLRALSVDMAVFRDLLRMKLPELSQHLDTLQRTANKESGG
GYEPPLTNVFTMQWFLTLFATCLPNQTVLKIWDSVFFEGSEIIL 
RVSLAIWAKLGEQIECCETADEFYSTMGRLTQEMLENDLLQSHE 
LMQTVYSMAPFPFPQLAELREKYTYNITPFPATVKPTSVSGRHS
KARDSDEENDPDDEDAWNAVGCLGPFSGFLAPELQKYQKQIKE
PNEEQSLRSIWIAELSPGAINSCRSEYHAAFNSMMMERMTTDIN 
ALKRQYSRIKKKQQQQVHQVYIRADKGPVTSILPSQVNSSPVIN 
HLLLGKKMKMTNRAAKNAVIHIPGHTGGKISPVPYEDLKTKLNS 
PWRTHIRVHKKNMPRTKSHPGCGDTVGLIDEQNEASKTNGLGAA
EAFPSGCTATAGREGSSPEGSTRRTIEGQSPEPVFGDADVDVSA
VQAKLGALELNQRDAAAETELRVHPPCQRHCPEPPSAPEENKAT 
SKAPQGSNSKTPIFSPFPSVKPLRKSATARNLGLYGPTERTPTV 
HFPOMSRSF^KPGf^N^nP* KT^\/rc:cr:TMT.QpnT.Dr:vDni?vnoM 

GGERFG 


6166 


2 


1206 


HKLWRTVAMAGAEWKSLEECLEKHLPLPDLQEVKRVLYGKELRK 
LDLPREAFEAASREDFELQGYAFEAAEEQLRRPRIVHVGLVQNR 
I PL PANAPVAE Q VS ALHRR I KAI VE VAAMCGVN 1 1 CFQEAWTMP 
FAFCTREKLPWTEFAESAEDGPTTRFCQKLAKNHDMWVSPILE 
RDSEHGDVLWNTAWISNSGAVLGKTRKNHIPRVGDFNESTYYM 
EGNIX3HPVFQTQFGRIAVNICYGRHHPLNWLMYSINGAEIIFNP 
SATIGALSESLWPIEARNAAIANHCFTCAINRVGTEHFPNEFTS 
GDGKKAHnriPflYPVTSQciYVazVPnciQPTPPT CPQPnPT t \rnirT nT 

NLCQQ VND V WN FKMTGR Y E M Y ARE LAE AVKSNYS PT I VKE * PAS 
VPALG 


6167 


1220 


1844 


YGIVTGPSLCAGDKQPKKQEKNPVLVSPEFVDEALCACEEYLSN 
LAHMDIDKDLEAPLYLTPEGWSLFLQRYYQWHEGAELRHliDTQ 
VQRCED I LQQLQAWPQ I DMEGDRN I W I VKPG AKSRGRG IMCMD 
HLEE M LKL VNGNP WMKDG KWWQ KY I E RPLL I FGT KFDLRQW F 
LVTDWNPLTVWFYRDSYIRFSTQPFSLKNLDK*APLYLTPEGWS 
LFLQRYYQWHEGAELRHLDTQVQRCEDILQQLQAWPQIDMEG 
DRNIWIVKPGAKSRGRGIMCMDHLEEMLKLVNGNPWMKDGKWV 
VOKYTPPPT.T.TPflTtf PnT.PflWFT .wpfiuixtdt ."n rw "c vd novTDfOT 

QPFSLKNLDK 


6168


84 


1392 


VWPVPSVSAMPPKKQAQAGGSKKAEQKKKEKIIEDKTFGLKNKK 
GAKQQKFIKAVTHQVKFGQQNPRQVAQSEAEKKLKKDDKKKELQ 
ELNELFKPWAAQKISKGADPKSWCAFFKQGQCTKGDKCKFSH 
uui jjHKA.uaivKt) v i !UAK]JJ!.l!.ijliKJJIMDNWUbKKljEE VVNKKHG 
EAEKKKP KTQ I VCKHFLEAI ENNK YGWFWVCPGGGD I CMYRHAL 
P PGF VLKKKKKKKKKEDE I S L * DL I ERERS ALG PNVTKI TLES F 
LAWKKRKRQEKIDKLEQDMERRKADFKAGKALVISGREVFEFRP 
ELVNDDDEEADDTRYTQGTGGDEVDDSVSVNDIDLSLYIPRDVD 
ETGIT VASLERF^TYTSDJf DFNKT.c; pa QnriQliTfTjnwD q ht t?t?dm 

EREGTENGAIDAVPVDENLFTGEDLDELEEELNTLDLEE 


6169 


112 


662 


APAAAMAERPEDLNLPNAVlTRllKEALPDGVNISKEARSAISR 
AASVFVL YATS CANN FAMKGKRKTLNASDVLSAMEEME FQRF VT 
PLKEALEAYRREQKGKKEASEQKKKDKDKKTDSEEQDKSRDEDN 
DEDEERLEEEEOMEEEEVDN+KGRPTVAPWKVPT.PMi?PaTr , pr'T? 
AFPCWAE 


6170 


62 


667 


STKVMLPNTGRLAGCTVFITGASRGIGKAIALKAAKDGANIVIA
AKTAQ PHP KLLGT I YTAAEE I EAVGGKALPC I VDVRDEQQI S AA 
VE KAI KKFGGID I LVNNASAI S LTNTLDTPTKRLDLMMNVNTRG 
TYLASKACIPYLKKSKVAHIPNISPPLNLNPVWFKQHCGRW*W 
G * GDGL CL I CFELNLCMSDV ITICT 


6171 


382 


941 


HFMQSDVELDCDIEPCGHTKFPPTLPLSTTVIVCSCHPVATAST 
MAEAFSKTTSEEDQSIQEPKEANSMTAQKQKK*GLRGSRRHHAN 
SGGDI FGDS FAA Y F P R VL KQ VHQALS LSQ EAVS VMDS MVRD I LD 
RIATEAGHLAHYSKCVTITSRDIRMAVCLLLPGKMGKLAESQGT 
NATLRYTKSK 


6172 


651 


54 


GLCRAGGAHRFSRTHVEAALKMLRREARLRREYLYRKAREEAQR 
SAQERKERLRRALEENRLIPTELRREALALQGSLEFDDAGGEGV 
TSHVDDEYRWAGVEDPKVMITTSRDPSSRLKMFAKELKLVFPGA 
QRMNRGRHEVGALVRACKANGVTDLLWHEHRGTPVGLIVSHLP 
FG PTA Y FTL CNWMRHD I PD LG TM S E AKPHL I THG FS S RLG KR V 
S D I LR YL F P VPKD DS HR V I T FANQDD Y I S FRHHV Y KKTDHRNVE 
LTEVGPRFELKLYMIRLGTLEQEATADVEWRWHPYTNTARKRVF 
LSTE*AAPRPLGQLL 


6173 


3 


288 


SVDHREVQVLSQSMPLTPHQAVLRGERPYMCVECGKCFGRSSHL
LQHQRIHTGEKPYVCSVCGKAFSQSSVLSKHRTIHTGEKPYECN 
ECGKAFRVSSDLAQHHKIHTGEKPHECLECRKAFTQLSHLIQHQ 
RIHTGERPYVCPLCGKAFNHSTVLRSHQRVHTGEKPHRCNECGK 
TFSVKRTLLQHQRIHTGEKPYTCSECGKAFSDRSVLIQHHNVHT 
GEKPYECSECGKTFSHRSTLMNHERIHTEEKPYACYECGKAFVQ 
HSHL I QHQKVHRKXi* PTCVLS VGSALAGVPTS FS I S VSTLERS P 
MCAV YVGR PS ARAQS LVNTGQ FTQVRS PMS VMS VEKP LE 


6174 


1060 


959 


PRPPGKRWMVAG1XSNPGLPGTRHSVGMAVLGQLARRLGVAESWT 
RDRHCAADLALAPLGDAQLVLLRPRRLMNANGRSVARAAEljFGL 
TAEEVYLVHDELDKPLGRLALKLGGSARGHNGVRSCISCLNSNA 
M P RLR VG I GR PAHP E AVQAHVLGC F S PAE Q E L LPLLLDRAT DL I 
LDHIRERSQGPSLGP+H+WFSKKA 


6175 


2204 


334 


RYFRADPRSRSGQPRAEGLGAFAEGPLRAMAAPVKGNRKQSTEG 
DALDPPASPKPAGKQNGIQNPISLEDSPEAGGEREEEQEREEEQ 
AFLVSLYKFMKERHTPIERVPHLGFKQINLWKIYKAVEKLGAYE 

GEDDKPLPTSKPRKQYKMAKENRGDDGATERPKKAKEERRMDQM 
MPGKTKADAADPAPLPSQEPPRNSTEQQGLASGSSVSFVGASGC 
PEA YKRLL S S F Y C KG THG I M S PLAKKKLLAQ VS K VE ALQ CQ E EG 

CRHGAEPQASPAVHLPESPQSPKGLTENSRHRLTPQEGLQAPGG 
SLREEAOAGPCPAAPIFKCSPPYTHPTF'VT.VPvqn'HDPnppc'DT v 

DGVLLGPPGKEGLSVKEPQLVWGGDANRPSAFHKGGSRKGILYP 
KP KAC W VS PMAKV P AE S P TLP PT F P S S PGLG S KRS LEEEG AAHS 
GKRLRAVS P FLKEADAKKCGAKPAGSGLVSCLLG PALGP VP P E A 
YRGTMLHCPLNFTGTPG PLKGQAALP FS PLVI PAFPAHFLATAG 
P S PMAAGLMH F P P TS FDS ALRHRL C PAS S AWHAP P VTTYAA PH F 
FHLNTKL 


6176 


1040 


402 


PLS ALRAMAE VH V I GQ I IGASGFS ES S LFCKWG I HTGAAW KLLS 
GVREGQTQVDTPQIGDMAYWSHPIDLHFATKGLQGWPRLHFQVW 
SQDS FGRCQLAG YGFCHVPSS PGTHQlaACPTWRPLGS WREQLAR 
AFVGGGPQLLHGDTIYSGADRYRLHTAAGGTVHLEIGLLLRNFD 
RYGVEC*GTLPPTSPPSTPRTPSDGGGWHSGQEHRL 


6177 


1400 


992 


VPIESLVGKVHNFPLIAFYCCEKGKRQPHKSLHDRCFGEALDPN 
CSHCYLDQI KRSDFLGFSG YS PHFVAI STNSEHKMQPSSMQQAL 
PSQ*PYWTDPRPALVPCCSHRPDVHRSRPGPGLPGTSGCSDRPP 
VCPI 


6178 


1027 


254 


STQRGG I KG VARAAS LVGRRRAGTGMALLLCLVCLTAAIjAHGCL 
HCHSNFSKKFSFYRHHVNFKSWWVGDIPVSGALLTDWSDDTMKE 
LHLAIPAKITREKLDQVATAVYQMMDQLYQGKMYFPGYFPNELR 
NIFREQVHLIQNAIIESRIDCQHRCGIFQYETISCNNCTDSHVA 
CFGYNCESSAQWKSAVQGLLNYINNWHKQDTSMRPRSSAFSWPG 
THRAAPAFLVLPALRCLE PPHLANLSLEDAA* CLKQH 


6179 


806 


276 


RGETREMAGNLLSGAGRRLWDWVPLACRSFSLGVPRLIGIRLTL 
P P P KWDRWNEKRAMFGVYDN I GI LGNFEKHPKELIRGPI WLRG 
WKGNELQRCIRKRKMVGSRMFADDLHNLNKRIRYLYKHFNRHGK 
FR* KRKLRTS EKAHLS P WRR ET VLFP VRKRL C I FS V I KWGFFG I 


6180 


156 


1833 


DHH I L KAAS TTH VCARGNI FA I PNTR CL EC* ATAT P S S LE CQN * 
SHLSLCPLPATTSGLTPNSMIPEKERQNIAERLLRVMCADLGAL 
SWSGKEFLKLAQTLVDSGARYGAFSVTEILGNFNTLALKHLPR 
M YNQ VKV KVTCALGSNACLG I G VTCHS Q S VG P DSC Y I LTA YQAE 
GNH I KS Y VLGVKGAD I RDSGDLVHHWVQN VLSEF VMSEIRTVYV 
TDCRVSTSAFSKAGMCLRCSACALNSWQSVLSKRTLQARSMHE 
VIELLNVCEDliAGSTGLAKETFGSLEETSPPPCWNSVTDSLLLV 
HER YEQ I CE FYS RAKKMNL I QS LNKHLLSNLAAI LTP VKQAV I E 
LSNESQPTLQLVLPTYVRLEKLFTAKANDAGTVSKLCHLFLEAL 
KENFK VH PAHKVAM I LDPQQKLR P VP P YQHEEI IGKVCEL INE V 
KESWAEEADFEPAAKKPRSAAVENPAAQEDDRLGKNEVYDYLQE 
PLFQATPDLFQYWSCVTQKHTKLAKLAFWLLAVPAVGARSGCVN 
MCEQALLI KRRRLLS PEDMNKLMFLKSNML 


6181 


169 


1032 


TRTLLSPVLLPGPRWKPWRRRPMGPLALPAWLQPRYRKNAYLFI
YYLIQFCGHSWIFTNMTVRFFSFGKDSMVDTFYAIGLVMRLCQS 
VS LLE LLH I YVG I E SNH LLPR FLQ LTE R 1 1 1 L F W ITS QEEVQ E 
K Y VVC VL F VF WNLLDM VR YT Y SMLSVIGIS YAVLT WLS QTL WM P 
I YPLCVLAEAFAI YQSLP YFES FGTYSTKLPFDLS I YFPYVLKI 
YLMMLFIGMYFTYSHLYSERRDILGIFPIKKKKM*STAFQCDTR 
KDRLWIQCS K*NTGS I LVEKFLVF 


6182 


1769 


1224 


AS*IDYQLNTLLKEFQLTEENTKLRYLTCSLIEDMAAAYFPDCI

VRPFGSSVNTFGKLGCDLDMFLDLDETRNLSAHKISGNFLMEFQ 

VKNVPSERIATQKILSVLGECLDHFGPGCVGVQKILNARCPLVR 

FSHQASGFQCDLTTNNRIALTSSELLYIYGALDSRVRALVFSVR 

CWARAHSLTSS I PGAW ITNFSLTMMVI FFLQRRSPP I LPTLDSL 

KTLABAEDKCVIEGNNCTFVRDLSRIKPSQNTETLELLLKEFFE 

YFGNFAFDKNSINIRQGREQNKPDSSPLYIQNPFETSLNISKNV 

SQSQLQKFVDLARESAWILQQEDTDRPSISSNRPWGLVSLLLPS 

APNRKSFTKKKSNKFAIETVKNLLESLKGNRTENFTKTSGKRTI 

STQT 


6183 


1118 


452 


HLDRYIKSPGSGSSTPAPPSHLLLYLLHPQSTRTMGCCGCSRGC 
GSGCGGCGSSCGGCGSGCGGCGSGRGGCGSGCGGCSSSCGGCGS 
RCYVPVCCCKPVCSWVPACSCTSCGSCGGSKGGCGSCGGSKGGC 
GSCGCSQSSCCKPCCCSSGCGSSCCQSSCCKPCCCQSSCCVPVC 
CQSSCCKPCCCQSNCCVPVCCQCKI*GSGPRPSGFSCLVKAFLM 
VP 


6184 


1 


2191 


IVTVREEDGAPAVAPPGVWSRANKRSGAGPGGSGGGGARGAEE

EPPPPLQAVLVADSFDRRFFPISKDQPRVLLPLANVALIDYTLE 

FLTATGVQETFVFCCWKAAQIKEHLLKSKWCRPTSLNWRIITS 

ELYRSLGDVLRDVOAKALVRSDFLLVYGDVISNINITRALEEHR 

LRRKL+KNVSVMTMIFKESSPSHPTRCHEDNWVAVDSTTNRVL 

HFQKTQGLRRFAFPLSLFQGSSDGVEVRYDLLDCHISICSPQVA 

QLFTDN FDYQTRDD F VRGLLVNEE I LGNQ IHMHVTAKE YGARVS 

NLHM YSAVCAD V I RRWVYPLT PEANFTDS TTQS CTHSRHN I YRG 

PEVSLGHGSILEENVLLGSGTVIGSNCFITNSVIGPGCHIEPGD 

NWLDQTYLWQG VRVAAGAQ I HQSLLCDNAEVKERVTIiKPRS VL 

TSQVWGPNITLPEGSVISLHPPDAEEDEDDGEFSDDSGADQEK 

DKVKM KG YNPAE VGAAG KG YL WKAAGMNMEEEE ELQQNL WG LK I 

NMEEESESESEQSMDSEEPDSRGGSPQMDDIKVFQNEVLGTLQR 

GKEENISCDNLVLEIKTSLKYAYNISLKEVMQVLSHWLEFPLQQ 

MDSPLDSSRYCALLLPLLKAWSPVFRNYIKRAADHLEALAAIED 

F FLEH EALGI SMAKVLMAF YQLEI LAE ET I LS WFS QRDTTD KGQ 

QLRKNQQLQRFIQWLKEAEEESSEDD 


6185


791 


44 


PCTSCVLWATLHLPASTRKAPQAECGMISITEWQKIGVGI'IXjFG ' 

IFFILFGTLLYFDSVLLAFGNLLFLTGLSLIIGLRKTFWFFFQR 

HKLKGTSFLLGGWIVLLRWPLLGMFLETYGFFSLFKGFFPVAF 

G FLGNVCNI PFLGALFRRLQGTSSMV* KTEMSSLNLDHWLKGAK 

REEWEPPPQSPALTHSPTYPGPPQVQKERNGAEQLTSNPQVDSR 

GCQEAEMQTPRRLGWGWYHTLTLYLWEEK 


6186


5^9 


238 


VYGIDSSNTNTHGAEERNRKLKKliWKLCHAQSRLDVNGLALKMA 
KERKVKNKVKNKADTEEVPNNSPTNQEKMPTSAILPDFSGSVIS

NIRNQMETLHSQPHQEENLCFENSFSLINLLPINAVEPTSSQQI 

PNRE T S E ANKERR KM TS KSSE SN I YS PLTS F I TADS ELHD 1 1 KD 

LEDCLMVGLHTCGDLAPNTLRIFTSNSEIKGVCSVGCCYHLLSE 

EFENQHKERTQEKWGFPMCHYLKEERWCCGRNARMSACLALERV 

AAGQGLPTESLFYRAVLQDIIKDCYGITKCDRHVGKIYSKCSSF 

LDYVRRSLKKLGLDESKLPEKIIMNYYEKYKPRMNELEAFNMLK 

WLAPCIETLILLDRLCYLKEQEDIAWSALVKLFDPVKSPRCYA 

VIALKKQQ* FPLKQI IRCISL * DSAGCAEEVSVGDGGPALRDAP 

PSGSRVGSRYD 


6187 


1701 


771 


DAWGPETRLARILNPDSFIEPRPGRLPELEATRPHMEPKASCPA
AA PLMERKFHVLVGVTGS VAALKLPLLVSKLLDI PGLEVAWTT 
ERAKHFYSPQDIPVTLYSDADEWEMWKSRSDPVLHIDIiRRWADL 
LLVAPLDANTLGKVASGICDNLLTCVMRAWDRSKPLLFCPAMNT 
AMWEHP I TAQQVDQLKAFGYVE I P CVAKKLVCGDEGLGAMAEVG 
TIVDKVKEVLFQHSGFQQS*PGISVMGVPLYSEWVQAKSVKMDV 
GKIGGYPHLLNGGPALSLPRGQACSRLNWTEGPGLSFFQPGEAA 
A 


6188 


238 


1534 


KGFVNAGPLMAELQVSPQWKAPEMSQICLSCGHPSA*GPRWASW 
NIGVFICIRCAGIHRNLGVHISRVKSVNLDQWTQEQIQCMQEMG 
NGKANRLYEAYLPETFRRPQIDPAVEGFIRDKYEKKKYMDRSLD 
INAFRKEKDDKWKRGSEPVPEKKLEPWFEKVKMPQKKEDPQLP 
RKSSPKS TAP VMDLLGLDAP VACS IANSKTSNTLEKDLDIjLASV 
PSPSSSGSRKWGSMPTAGSAGSVPENLNLFPEPGSKSEEIGKK 
QLSKDSILSLYGSQTPQMPTQAMFMAPAQMAYPTAYPSFPGVTP 
PNSIMGSMMPPPVGMVAQPGASGMVAPMAMPAGYMGGMQASMMG 
VPNGMMTTQQAGYMAGMAAMPQTVYGVQPAQQLQWNLTQMTQQM 
AGMNFYGANGMMNYGQSMSGGNEQAANQTLSPQMWK 


6189 


1297 


793 


LGEPLGDLCELIPGDVQQLQMGEVHPGTGAQGSAAQSVAGEVQL

TQLSHARQRPSCQGSQLIAIiDLQHMDlSRQPRWQHVQPVARQVQ 

RAQQAQLAEGVAVHLWAGDAWAEVELLQEVGGGKVFAANACDL 

WQDHEGA^AARQATGHALQRVIVQVRRVQPLEAL*RVPSGLPR 

RVRAFMILHNQITGIGREDFATTYFLEELNLSYNR1TSPQVHRD 

AFRKLRLLRSLDLSGNRLHMLPPGLPRNVHVLKVKRNBLAALAR 

GALAGMAQLRELYLTSNRLRSRALGPRAWVDLAHLQLLDIAGNQ 

LTE I P EGLPES LE YL YLQNNKI SAVPANAFDS TPNLKG I FLRFN 

KLAVG SWDSAFRRLKHLQVLDIEGNLEFGD I S KDRGRLGKEKE 

EEEEDEVEEEETR 


6190 


66 


1309 


ILVGNVSFLLSFAEYVCNCSWGSLNVNRCNQTTGQCECRPGYQ
GLHCETCKEGFYLN YTSGLCQ PCDCS PHGALS I PCNSSGKCQCK 
VGVIGSICDRCQDGYYGFSKNGCLPCQCNNRSASCDALTGACLN 
CQENS KGNHCEECKEGFYQSPDATKECLRCPCSAVTSTGSCS IK 
SSELEPECDQCKDGYIGPNCNKCENGYYNFDSICRKCQCHGHVY 
PVKTPKICKPESGECINCLHNTTGFWCENCL*GYVHDLEGNCIK 
KVILPTPEGSTILVSNASLTTSVPTPVINSTFTPTTLQTIFSVS 
TSENSTSALADVSWTQFNI I ILTVI 1 1 VWLLMGFVGAVYMYRE 
YQNRKLNAPFWTI ELKEDNI S FSSYHDS I PNADVSGLLEDDGNE 
VAPNGQLTLTTPIHNYKA 


6191 


1212 


1511 


VNLCHGGLLHLSTHHLGIKPSMH*LFFLMLSFPHLTPQQPKCPS

MIDWIKKIWYIYTMEYYATIKRNEIMFFAGTWMEMEAIILSKLM 

QDYMFSLISGS 


6192 


3 


950 


TRGCGNKMAGKKNVLSSLAVYAEDSEPESDGEAGIEAVGSAAEE 
KGGLVSDAYGEDDFSRLGGDEDGYEEEEDENSRQSEDDDSETEK
PEADDPKDNTEAEKRDPQELVASFSERVRNMSPDEIKIPPEPPG
RCSNHLQDKIQKLYERKIKEGMDMNYIIQRKKEFRNPSIYEKLI
QFCAIDELGTNYPKDMFDPHGWSEDSYYEALAKAQKIEMDKLEK 
AKKERTKIEFVTGTKKGTTTNATSTTTTTASTAVADAQKRKSKW

DSAIPVTTIAQPTILTTTATLPAWTVTTSASGSKTTVISAVGT 

IVKKAKQ 


6193 


3 


950 


TRGCGNKMAGKKNVLSSLAVYAEDSEPESDGEAGIEAVGSAAEE
KGGLVSDAYGEDDFSRLGGDEDGYEEEEDENSRQSEDDDSETEK 
PEADDPKDNTEAEKRDPQELVASFSERVRNMSPDEIKIPPEPPG 
RCSNHLQDKIQKLYERKIKEGMDMNYIIQRKKEFRNPSIYEKLI
QFCAIDELGTNYPKDMFDPHGWSEDSYYEALAKAQKIEMDKLEK 
AKKERTKIEFVTGTKKGTTTNATSTTTTTASTAVADAQKRKSKW
DSAIPVTTIAQPTILTTTATLPAWTVTTSASGSKTTVISAVGT
IVKKAKQ 


6194 


3 


950 


TRGCGNKMAGKKNVLSSLAVYAEDSEPESDGEAGIEAVGSAAEE 
KGGLVSDAYGEDDFSRLGGDEDGYEEEEDENSRQSEDDDSETEK 
PEADDPKDNTEAEKRDPQELVASFSERVRNMSPDEIKIPPEPPG 
RCSNHLQDKIQKLYERKIKEGMDMNYIIQRKKEFRNPSIYEKLI 
QFCAIDELGTNYPKDMFDPHGWSEDSYYEALAKAQKIEMDKLEK 
AKKERTKIEFVTGTKKGTTTNATSTTTTTASTAVADAQKRKSKW
DSAIPVTTIAQPTILTTTATLPAWTVTTSASGSKTTVISAVGT
IVKKAKQ 


6195 


736 


235 


VANGLQSNMPKFYCDYCDTYLTHDSPSVRKTHCSGRKHKENVKD 
YYQKWMEEQAQSLIDKTTAAFQQGKIPPTPFSAPPPAGAMIPPP 
PSLPGPPRPGMMPAPHMGGPPMMPMMGPPPPGMMPVGPAPGMRP 
PMGGHMPMMPGPPMMRPPARPMMVPTRPGMTRPDR 


6196 


1512 


623 


KTG KRRS AA YVRN I LDNAE Q V I SNLE ARNLGPRLT PLLQEEDSH 
QRLLMGLMVS ELKDHFLRHLQGVE KKKI EQMVLD Y I S KLLDL I C 
H I VE TNWRKHNLH S WVLH FNS RGS AAE FAVFH I MTR I LE ATNS L 
FLPLPPGFHTLHTILGVQCLPLHNLLHCIDSGVLLLTETAVIRL 
MKDLDNTEKNEKLKFSIIVRLPPLIGQKICRLWDHPMSSNIISR 
NHVTRLLQNYKKQPRNSMINKSSFSVEFLPLNYFIEILTDIESS 
NQ AL Y P F EGHDNVDAE FVE EAAL KHTAMLLGL 


6197 


3 


819 


ADPEGTEEAVMSRYTRPPNTSLFIRNVADATRPEDLRREFGRYG 
P I VD V Y I PLDF YTR R P RG FAYVQ FEDVRDAEDALYNLNRKWVCG 
RQIEIQFAQGDRKTPGQMKSKERHPCSPSDHRRSRSPSQRRTRS 
RSSSWGRNRRRSDSLKESRHRRFSYSQSKSRSKSLPRRSTSARQ 
SRTPRRNFGSRGRSRS KSLQKRS KS IGKSQSSS PQKQTS SGTKS 

RSHGRHSDSIARSPCKSPKGYTNFETKVQTAKHSHFRSHSRSRS 
YRHKNSW 


6198 


111 


1912 


SEAALSPSFISPACFLLRKLPALEDGTLPHPDTLGMNYEGARSE 
RENHAAD D S EGGALDM CCS ERLPGLPQPIVM E ALDE AEG LQDS Q 
REMPPPPPPSPPSDPAQKPPPRGAGSHSLTVRSSLCLFAASQFL 
LACGVLWFSGYGHIWSQNATNLVSSLLTLLKQLEPTAWLDSGTW 
GVPSLLLVFLSGGLVLVTTLVWHLLRTPPEPPTPLPPEDRRQSV 
SRQPSFTYSEWMEEKIEDDFLDLDPVPETPVFDCVMDIKPEADP 
TSLTVKSMGLQERRGSNVSLTLDMCTPGCNEEGFGYLMSPREES 
ARE YLLS ASRVLQAEELHEKALDP FLLQAE FFE I PMNF VDPKE Y 
D I PGLVRKNR YKTI LPNPHSRVCLTS PDPDDPLSS YINANYIRG 
YGGEEKVYIATQGPIVSTVADFWRMVWQEHTPIIVKITNIEEMN 
EKCTEYWPEEQVAYDGVEITVQKVIHTEDYRLRLISLKSGTEER 
GLKHYWFTSWPDQKTPDRAPPLLHLVREVBEAAQQEGPHCAPII 
VHCS AG IGRTGCF I ATS I CCQQLRQEGWD I LKTTCQLRQDRGG 
M I QHCEQYQFVHHVMS L YEKQLSHQS PE 


6199 


144 


1211 


MARENGES S S S WKKQAED I KK I FEFKETLGTGAFS EWLAEEKA 
TGKLFAVKC I PKKALKGKESS I ENE I AVLRKI KHENI VALEDI Y 
ESPNHLYLVMQLVSGGELFDRIVEKGFYTEKDASTLIRQVLDAV 
YYLHRMGIVHRDLKPENLLYYSQDEESKIMISDFGLSKMEGKGD 
VMSTACGTPGYVAPEVLAQKPYSKAVDCWSIGVIAYILLCGYPP 
FYDENDSKLFEQILKAEYEFDSPYWDDISDSAKDFIRNLMEKDP 

NKRYTCEQAARHPWIAGDTALNKNIHESVSAQIRKNFAKSKWRQ 

AFNATAWRHMRKLHLGSSLDSSNASVSSSLSLASQKDCASGTF 
HAL* 


6200 


702 


96 


LPEVPHSLRPRVKPHLCCAQPAVRVMARLPKLAVFDLDYTLWPF 
W VDTHVD P P FHKS S DGTVRDRRGQDVRL YPEVPEVLKR LQS LG V 
PGAAASRTSEIEGANQLLELFDLFRYFVHREIYPGSKITHFERL 
QQKTGIPFSQMIFFDDERRNIVDVSKLGVTCIHIQNGMNLQTLS 
QGLETFAKAQTGPLRSSLEESPFEA 


6201 


2809 


2383 


GQTPRVRWKMRRSLRAGKRRQTAGRKSKSPPKVPIVIQDDSLPA 
GPPPQIRILKRPTSNGWSSPNSTSRPTLPVKSLAQREAEYAEA 
RKRILGSASPEEEQEKPILDRPTRISQPEDSRQPNNVIRQPLGP 
DGSQGFKQRR 


6202 


2 


426 


INADRAAVASSLLSRPTRKMAPQKDRKPKRSTWRFNLDLTHPVE 
DG I FDS GN FEQFLRE KVKVNG KTGNLGN WH I ERFKNKI TWS E 

KQFSKRYLKYLTKKYLKKNNLRDWLRWASDKETYELRYFQISQ 
DEDESESED 


6203 


419 


2550 


RCPRPPATAGAAASRPDRSPPSGISGSEAAAGAGAAAPASQHPA

TGTGAVQTEAMKQILGVIDKKLRNLEKKKGKLDDYQERMNKGER 

LNQDQLDAVS KYQEVTNNLEFAKELQRSFMALSQDIQKTT KKTA 

RREQLMREEAEQKRLKTVLELQYVLDKLGDDEVRTDLKQGLNGV 

PIIiSEEELSLLDEFYKLVDPERDMSLRLNEQYEHASIHLWDLLE 

GKEKPVCGTTYKVLKEIVERVFQSNYFDSTHNHQNGLCEEEEAA 

S AP AVEDQ V P EAE P E PAEE YTEQS E VES T E YVNRQFMAE TQ FTS 

GEKEQVDEWTVETVEWNSLQQQPQAASPSVPEPHSLTPVAQAD 

P L VR RQR VQDLMAQMQG P YNFI QDS M LD FENQTLDPAI VS AQ PM 

NPTQNMDMPQLVCPPVHSESRLAQPNQVPVQPEATQVPLVSSTS 

EGYTASQPLYQPSHATEQRPQKEPIDQIQATISLNTDQTTASSS 

LPAASQPQVFQAGTSKPLHSSGINVNAAPFQSMQTVFKMNAPVP 

PVNEPETLKQQNQYQASYNQSFSSQPHQVEQTELQQEQLQTWG 

T YHG S P DQS HQ VTGNHQQP PQQNTG F P R SNQ P Y YNS RG VS RGGS 

RG ARGLMNG YRG PANG FRGG YDG YRPS FS NTPNSG YTQS Q FS AP 

RDYSGYQRDGYQQNFKRGSGQSGPRGAPRGRGGPPRPNRGMPQM 
NTQQVN 


6204 


2933 


787 


CTHWLISLLGGRALIHFNRFLNLKIQEGEAHNIFCPAYDCFQLV 
PGD 1 1 KSWS KEMDKRYLQFD IKAFVENNPAI KWCPTPGCDRAV 
RLTKQGSNTSGSDTLSFPLLRAPAVDCGKGHLFCWECLGEAHEP 
CDCQTWKNWLQKITEMKPEELVGVSEAYEDAANCLWLLTNSKPC 
ANC KS P I QKNEG CNHMQCAXC KYD F CW I CLEEW KKH S FVH WE V I 
YRCTRYEVIQHVEEQSKEMTVEAEKKHKRFQELDRFMHYYTRFK 
NHEHSYQLEQRLLKTAKEKMEQLSRALKETEGGCPDTTFIEDAV 
HVLLKTRRILKCSYPYGFFLEPKSTKKEIFELMQTDLEMVTEDL 
AQKVNRPYLRTPRHKIIKAACLVQQKRQEFLASVARGVAPADSP 
EAPRRS FAGGTWDWEYLGFAS PEEYAEFQYRRRHRQRRRGDVHS 
LLSNPPDPDEPSESTLDIPEGGSSSRRPGTSWSSASMSVLHSS 
SLRDYTPASRSENQDSLQALSSLDEDDPNILLAIQLSLQESGLA 
LDEETRDFLSNEASLGAIGTSLPSRLDSVPRNTDSPRAALSSSE 
LLELGDSLMRLGAENDPFSTDTLSSHPLSEARSDFCPSSSDPDS 
AGQDPNINDNLLGNIMAWFHDMNPQSIALIPPATTEISADSQLP 
C I KD GS EG VKD VE LVL P EDS M FEDAS VS EGRGTQ I E ENP LE EN I 
PGGGKQHPQAW 


6205 


1 


1200 


RAHRGKMALEVGDMEDGQLSDSDSDMTVAPSDRPLQLPKVLGGD 
SAMRAFQNTATACAPVSHYRAVESVDSSEES FSDSDDDS CLWKR 
KRQKCFNPPPKPEPFQFGQSSQKPPVAGGKKINNIWGAVLQEQN 
QDAVATELG I LGMEGT IDRSRQS ETYNYLLAKKLRKESQEHTKD 
LDKELDEYMHGGKKMGSKEEENGQGHLKRKRPVKDRLGNRPEMN 
YKGRYEITAEDSQEKVADBISPRLQEPKKDLiARWlillGNKKA 
I ELLMETAEVEQNGGLFIMNGSRRRTPGGVFLNLLKNTPS I SEE 
Q I KD I F Y I ENQ KE YENKKAAR KRRTQ VLGKKM KQA I KS LN FQE D 
DDTSRE T FASDTNEALAS LD E SQEGHAE AKLEAEEAI EVDHSHD 
LDIF 


6206 


10 


1442 


irSERRERSCLHLVCIRCSCDWEMGSVIXSLCSMASWIPCLCGS 
APCLLCRCCPSGNNSTVTRLIYALFLLVGVCVACVMLIPGMEEQ 
LNKIPGFCENEKGWPCNILVGYKAVYRLCFGLAMFYLLLSLLM 
I KVKS S SDPRAAVHNGFWFFKFAAAI AI I IGAFFI PEGTFTTVW 
FYVGMAGAFCFILIQLVLLIDFAHSWNESWVEKMEEGNSRCWYA 
ALLSATALNYLLSLVAI VLFFVYYTHPASCS ENKAFI SVNMLLC 
VGASVMSILPKIQESQPRSGLLQSSVITVYTMYLTWSAMTNEPE 
TNCNPSLLS I IGYNTTSTVPKEGQSVQWWHAQGI IGLILFLLCV 
FYSSIRTSNNSQVNKLTLTSDESTLIEDGGARSDGSLEDGDDVH 
RAVDNERDGVTYSYS FFHFMLFLASLYIMMTLTNWYRYEPSREM 


6207 


2924 


1471 


TVMAEAATPGTTATTSGAGAAAATAAAASPTPIPTVTAPSLGAG
GGGGGSDGSGGGWTKQVTCRYFMHGVCKEGDNCRYSHDLSDSPY
SWCKYFQRGYCIYGDRCRYEHSKPLKQEEATATELTTKSSLAA
SSSLSSIVGPLVEMNTGEAESRNSNFATVGAGSEDWVNAIEFVP
GQPYCGRTAPSCTEAPLQGSVTKEESEKEQTAVETKKQLCPYAA
VGECRYGENCVYLHGDSCDMCGLQVLHPMDAAQRSQHIKSCIEA
HEKDMELSFAVQRSKDMVCGICMEVVYEKANPSERRFGILSNCN
HTYCLKCIRKWRSAKQFESKIIKSCPECRITSNFVIPSEYWVEE
KEEKQKLILKYKEAMSNKACRYFDEGRGSCPFGGNCFYKHAYPD
GRREEPQRQKVGTSSRYRAQRRNHFWELIEERENSNPFDNDEEE
VVTFELGEMLLMLLAAGGDDELTDSEDEWDLFHDELEDFYDLDL


6208 


2924 


1471 


TVMAEAATPGTTATTSGAGAAAATAAAASPTPIPTVTAPSLGAG
GGGGGSDGSGGGWTKQVTCRYFMHGVCKEGDNCRYSHDLSDSPY 
SWCKYFQRGYCIYGDRCRYEHSKPLKQEEATATELTTKSSLAA 
SSSLSSIVGPLVEMNTGEAESRNSNFATVGAGSEDWVNAIEFVP 
GQPYCGRTAPSCTEAPLQGSVTKEESEKEQTAVETKKQLCPYAA 
VGECRYGENCVYLHGDSCDMCGLQVLHPMDAAQRSQHIKSCIEA 
HEKDMELSFAVQRSKDMVCGICMEVVYEKANPSERRFGILSNCN
HTYCLKCIRKWRSAKQFESKIIKSCPECRITSNFVIPSEYWVEE
KEEKQKLILKYKEAMSNKACRYFDEGRGSCPFGGNCFYKHAYPD
GRREEPQRQKVGTSSRYRAQRRNHFWELIEERENSNPFDNDEEE
WTFELGEMLLMLLAAGGDDELTDSEDEWDLFHDELEDFYDLDL 


6209


1758 


829 


ERLCFPCMQS KIYSYMS PNKCSGMRFPLQE ENS VTHHE VKCQGK 
PLAG I YRKREE KRNAGNAVRS AMKS EEQKI KDARKGPLVPFPNQ 
KS E AAE P P KT P P S S CDS TNAAI AKQALKKP I KG KQAPR KKAQGK 
TQQNRKLTDFYPVRRSSRKSKAELQSEERKRIDELIESGKEEGM 
KIDLIDGKGRGVIATKQFSRGDFWEYHGDLIEITDAKKREALY 
AQDPSTGCYMYYFQYLSKTYCVDATRETNRLGRLINHSKCGNCQ 
TKLHD I DGVPHL I LIASRDI AAGEELL.YDYGDRS KAS IEAHPWL 
KH 


6210 


3761 


387 


IFGMSKLRMVLLEDSGSADFRRHFVNLSPFTITWLLLSACFVT
SSLGGTDKELRLVDGENKCSGRVEVKVQEEWGTVCNNGWSMEAV
SVICNQLGCPTAIKAPGWANSSAGSGRIWMDHVSCRGNESALWD
CKHDGWGKHSNCTHQQDAGVTCSDGSNLEMRLTRGGNMCSGRIE
IKFQGRWGTVCDDNFNIDHASVICRQLECGSAVSFSGSSNFGEG
SGPIWFDDLICNGNESALWNCKHQGWGKHNCDHAEDAGVICSKG
ADLSLRLVDGVTECSGRLEVRFQGEWGTICDDGWDSYDAAVACK
QLGCPTAVTAIGRVNASKGFGHIWLDSVSCQGHEPAVWQCKHHE
WGKHYCNHNEDAGVTCSDGSDLELRLRGGGSRCAGTVEVEIQRL
LGKVCDRGWGLKEADVVCRQLGCGSALKTSYQVYSKIQATNTWL
FLSSCNGNETSLWDCKNWQWGGLTCDHYEEAKITCSAHREPRLV
GGDIPCSGRVEVKHGDTWGSICDSDFSLEAASVLCRELQCGTVV
SILGGAHFGEGNGQIWAEEFQCEGHESHLSLCPVAPRPEGTCSH

SRDVGWCSRYTEIRLVNGKTPCEGRVELKTLGAWGSLCNSHWD 
I EDAHVLCOOLKCGVAIiSTPGGAR Pnv nnnn twphm j?ur , Tr*'rn»o 

HMGDCPVTALGASLCPSEQVASVICSGNQSQTLSSCNSSSLGPT 
RPTIPEESAVACIESGQLRLVNGGGRCAGRVEIYHEGSWGTICD 
DSWDLSDAHVVCRQLGCGEAINATGSAHFGEGTGPIWLDEMKCN 
GKESRIWQCHSHGWGQQNCRHKEDAGVICSEFMSLRLTSEASRE 
ACAGRLEVFYNGAWGTVGKSSMSETTVGWCRQLGCADKGKINP 
ASLDKAMSIPMWVDNVQCPKGPDTLWQCPSSPWEKRLASPSEET 
WITCDNKIRLQEGPTSCSGRVEIWHGGSWGTVCDDSWDLDDAQV
VCQQLGCGPALKAFKEAEFGQGTGPIWLNEVKCKGNESSLWDCP 
ARRWGHSECGHKEDAAVNCTDISVQKTPQKATTGRSSRQSSFIA 
VGILGWLLAIFVALFFLTKKRRQRQRLAVSSRGENLVHQIQYR
EMNSCLNADDLDLMNSSGGHSEPH


6211 



3761 


387 


IFGMSKLRMVLLEDSGSADFRRHFVNLSPFTITWLLLSACFVT
SSLGGTDKELRLVDGENKCSGRVEVKVQEEWGTVCNNGWSMEAV 
SVICNQLGCPTAIKAPGWANSSAGSGRIWMDHVSCRGNESALWD 
CKHDGWGKHSNCTHQQDAGVTCSDGSNLEMRLTRGGNMCSGRIE 
IKFQGRWGTVCDDNFNIDHASVICRQLECGSAVSFSGSSNFGEG
SGPIWFDDLICNGNESALWNCKHQGWGKHNCDHAEDAGVICSKG 
ADLSLRLVDGVTECSGRLEVRFQGEWGTICDDGWDSYDAAVACK
QLGCPTAVTAIGRVNASKGFGHIWLDSVSCQGHEPAVWQCKHHE
WGKHYCNHNEDAGVTCSDGSDLELRLRGGGSRCAGTVEVEIQRL 
LGKVCDRGWGLKEADVVCRQLGCGSALKTSYQVYSKIQATNTWL 
FLSSCNGNETSLWDCKNWQWGGLTCDHYEEAKITCSAHREPRLV
GGDIPCSGRVEVKHGDTWGSICDSDFSLEAASVLCRELQCGTVV
SILGGAHFGEGNGQIWAEEFQCEGHESHLSLCPVAPRPEGTCSH
SRDVGWCSRYTEIRLVNGKTPCEGRVELKTLGAWGSLCNSHWD
IEDAHVLCOOLKCGVALSTPGGAR Pf5K'f5Mf^nTWP«Mi?ur"Tv* , Ti7/^ 

HMGDCPVTALGASLCPSEQVASVICSGKTQSQTLSSCNSSSLGPT 
RPTIPEESAVACIESGQLRLVNGGGRCAGRVEIYHEGSWGTICD 
DSWDLSDAHVVCRQLGCGEAINATGSAHFGEGTGPIWLDEMKCN
GKESRIWQCHSHGWGQQNCRHKEDAGVICSEFMSLRLTSEASRE 
ACAGRLEVFYNGAWGTVGKSSMSETTVGWCRQLGCADKGKINP
ASLDKAMSIPMWVDNVQCPKGPDTLWQCPSSPWEKRLASPSEET 
WITCDNKIRLQEGPTSCSGRVEIWHGGSWGTVCDDSWDLDDAQV 
VCQQLGCGPALKAFKEAEFGQGTGPIWLNEVKCKGNESSLWDCP
ARRWGHSECGHKEDAAVNCTDISVQKTPQKATTGRSSRQSSFIA
VGILGWLLAIFVALFFLTKKRRQRQRLAVSSRGENLVHQIQYR 
EMNSCLNADDLDLMNSSGGHSEPH 


6212 


1 


1134 


LKWELRPGGAVWGTGRGAGTGAPRSCCCQTNPGPPSSLRRAFRR 
RELPFPACHEIGLGAEAGSGPPPAPAARESRSRAMEEEASSPGL 
GCSKPHLEKLTLGITRILESSPGVTEVTIIEKPPAERHMISSWE 
QKNNCVMPEDVKNFYLMTNGFHMTWSVKLDEHIIPLGSMAINSI
SKLTQLTQSSMYSLPNAPTLADLEDDTHEASDDQPEKPHFDSRS 
VIFELDSCNGSGKVCLVYKSGKPALAEDTEIWFLDRALYWHFLT 
DTFTAYYRLLITHLGLPQWQYAFTSYGISPQAKQRVSMYKPITY
NTNLLTEETDSFVNKLDPSKVFKSKNKIVIPKKKGPVQPAGGQK
GPSGPSGPSTSSTSKSSSGSGNPTRK


6213 


1 


1134 


LKWELRPGGAVWGTGRGAGTGAPRSCCCQTNPGPPSSLRRAFRR 
RELPFPACHEIGLGAEAGSGPPPAPAARESRSRAMEEEASSPGL
GCSKPHLEKLTLGITRILESSPGVTEVTIIEKPPAERHMISSWE 
QKNNCVMPEDVKNFYLMTNGFHMTWSVKLDEHIIPLGSMAINSI 
SKLTQLTQSSMYSLPNAPTLADLEDDTHEASDDQPEKPHFDSRS 
VIFELDSCNGSGKVCLVYKSGKPALAEDTEIWFLDRALYWHFLT
DTFTAYYRLLITHLGLPQWQYAFTSYGISPQAKQRVSMYKPITY
NTNLLTEETDSFVNKLDPSKVFKSKNKIVIPKKKGPVQPAGGQK 
GPSGPSGPSTSSTSKSSSGSGNPTRK 


6214 


2 


460 


HELAPSAIRRAARLGLGPARWQSRAAAFYFVRGFRTGWSFVGWV 
VLGTS AKRTRL F FFLS KMAAS SRAQ VLAL YRAMLR E S KR FS AYN 
YRTYAVRRIRDAFRENKNVKDPVE IQTLVNKAKRDLGVI RRQVH 
IGQLYSTDKLIIENRDMPRT 


6215 


2 


1849


FVAGGPRGSGSAAETMPE I RVTPLGAGQDVGRSCILVS I AGKNV 
MLD CQMHMG FNDDRR F PD F S Y I TQNGR LTD FLD C V 1 1 S HFH LDK 
CG AIi P Y FS EM VG YDG P I YMTHPTQAI C P ILLED YR K I AVD KKGE 
AN FFTS QM I KDCMKKWAVHLHQTVQVDDELE I KAYYAGHVLGA 
AMFQIKVGSESWYTGDYNMTPDRHLGAAWIDKCRPNLLITEST 
YATTI RDSKRCRERDFLKKVHETVERGGKVLI PVFALGRAQELC 
ILLETFWERMNLKVPIYF3TGLTEKANHYYKLFIPWTNQKIRKT 
FVQRNMFEFKH I KAFDRAFADNPGPM WFATPGMLHAGQSLQI F 
RKWAGNEKNMVIMPGYCVQGTVGHKILSGQRKLEMEGRQVLEVK 
MQVE YMS FSAHADAKGIMQLVGQAE PES VLLVHGEAKKMEFIiKQ 
KIEQELRVNCYMPANGETVTLPTSPS I PVGISLGLLKREMAQGL 
LPEAKKPRLLHGTLIMKDSNFRLVSSEQALKELGLAEHQLRFTC 
RVHLHDTRKEQETALRVYSHLKSVLKDHCVQHLPDGSVTVESVL 
LQAAAPS EDPGTKVLLVS WTYQDE ELGS FLTS LLKKGLPQAPS 


6216 


11 


393 


QTTRPEPRNSALRQSRSKMAWGVSSVSRLLGRSRPQLGRPMSS 
G AHG E EG S ARM W KTLTF FVAL PGVAVS MLNVYL KS HHGEHE R P E 
FIAYPHLRIRTKPFPWGDGNHTLFHNPHVNPLPTGYEDE 


6217 


9 


1178 


TRVGRGESGLKMEVKPPPGRPQPDSGRRRRRRGEEGHDPKEPEQ 
LRKLFIGGLSFETTDDSLREHFEKWGTLTDCWMRDPQTKRSRG 
FG F VT YS CVE E VDAAMCAR P H KVDGR WE P KRAVS REDS VKPGA 
HLTVKKIFVGGIKEDTEEYNLRDYFEKYGKIETIEVMEDRQSGK 
KRG FAF VT FDDHDT VDK I WQ K YHT I NGHNCEVKKALS KQEMQ S 
AGSQRGRGGGSGNFMGRGGNFGGGGGNFGRGGNFGGRGGYGGGG 
GGSRGSYGGGDGGYNGFGGDGGNYGGGPGYSSRGGYGGGGPGYG 
NQGGG YGGGGG YDG YNEGGN FGGGN YGGGGNYND FGNYSGQQQS 
NYGPMKGGSFGGRS&GSPYGGGYGSGGGSGGYGSRRF 


6218


1305 


906 


SCERRGFIMADDLKRFLYKKLPSVEGLHAIWSDRDGVPVIKVA 
NDNAPEHALRPGFLSTFALATDQGSKLGLSKNKS I I CYYNTYQV 
VQFNRLPLWSFIASSSANTGLIVSLEKELAPLFEEIiRQWEVS 


6219 


2 


890 


AGPGEGAGAGTRCAGAEAEMASAGGEDCESPAPEADRPHQRPFL 
IGVSGGTASGKSTVCEKIMELLGQNEVEQRQRKWILSQDRFYK 
VLTAEQKAKALKGQYNFDHPDAFDNDLMHRTLKNIVEGKTVEVP 
TYDFVTHSRLPETTWYPADWLFEGILVFYSQEIRDMFHLRLF 
VDTDSDVRLSRRVLRDVRRGRDLEQIIiTQYTTFVKPAFEEFCLP 
TKKYADVI I PRGVDNMVAINLI VQHIQDILNGDICKWHRGGSNG 
RSYKRTFSEPGDHPGMLTSGKRSHLESSSRPH 


6220 


0 ? "7 


764 


EQNISLEMSCTI EKALADAKALVERLRDHDDAAESL IEQTTALN 
KRVEAMKQYQEEIQELNEVARHRPRSTLVMGIQQENRQIRELQQ 
ENKELRTSLEEHQSALELTMS KYREQMFRLLMASKKDDPGI I MK 
LKEQHSKIDMVHRNKSEGFFLDASRHILEAPQHGLERRHLEANQ 
NVH 


6221


98 


916 


RW I WDLNPVSDGLELRPKYNGI LHCLTTI WKLDGLRGLYQGVTP 
NIWGAGLSWGLYFVFYNAIKSYKTEGRAERLEATEYLVSAAEAG 
AMTLCITNPLWVTKTRLMLQYDAWNSPHRQYKGMFDTLVKIYK 
YEGVRGLYKGFVPGLFGTSHGALQFMAYELLKLKYNQHINRLPE 
AQLSTVEYI SVAALSKI FAVAATYPYQWRARLQDQHMF YSGVI 
DVITKTWRKEGVGGFYKGIAPNLIRVTPACCITFWYENVSHFL 
LDLREKRK 


6222


2 


2116 


MARELRALLLWGRRLRPLLRAPALAAVPGGKPILCPRRTTAQLG 
PRRNPAWSLQAGRLFSTQTAEDKEEPLHSIISSTESVQGSTSKH 
EFQAETKKLLDIVARSLYSEKEVFIRELISNASDALEKLRHKLV 
S DGQAL P E M E I HLQTN AE KG T I T I QDTG I GMTQE ELVS NLGT I A 
RSGSKAFLDALQNQAEASSKIIGQFGVGFYSAFMVADRVEVYSR 
SAAPGSLGYQWLSDGSGVFE IAEASGVRTGTKI I IHLKSDCKEF 
S SE AR VRD WTK Y S N FVS F PL YLNGRRMNTLQ A I WMMD P KD VRE 
WQHEEFYRYVAQAHDKPRYTLHYKTDAPLNIRSIFYVPDMKPSM 
FDVSRELGSSVALYSRKVLIQTKATDILPKWLRFIRGWDSEDI 
PLNLSRELLQESALIRKLRDVLQQRLIKFFIDQSKKDAEKYAKF 
FEDYGLFMREGIVTATEQEVKEDIAKLLRYESSALPSGQLTSLS 
EYASRMRAGTRNIYYLCAPNRHLAEHSPYYEAMKKKDTEVLFCF 
EQFDELTLLHLREFDKKKLISVETDIWDHYKEEKFEDRSPAAE 
CLSEKETEELMAWNRNVI^SRVTNVKVTLRLD^HPA^^VTVLE^1G 
AARHFLRMQQLAKTQEERAQLLQPTLEINPRHALIKKLNQLRAS 
EPGLAQLLVDQIYENAMIAAGLVDDPRAMVGRLNELLVKALERH 


6223 


3 


715 


DAWARTMAGM VDFQ D E EQ VKS FLENME VE CN YHC Y H E KD PDG C Y 
R L VD YLEG I R KNFDE AAKVL K FNCE ENQHS D S CY KLGAY YVTG K 
GGLTQDLKAAARCFLMACEKPGKKS I AACHNVGLLAHDGQVNED 
GQPDLG KARDY YTRACDGG YTS S CFNLSAM FLQGAPGFPKDMDL 
ACKYS MKACDLGH I WACANAS RM YKLG DG VD KVE AKAE VL KNRA 
QQVHKEQQKGVQPLTFG 


6224 


1 


133 


LRTI SSMAWGPLLLTLLAHCTGS WAQSVLTQPPSVSGARI PHEK 


6225 


3259 


938 


LLS CHRLAI CKLPFS VESRKTVMGPQGARRQAFLAFGDVTVDPT 
QKEWRLLSPAQRALYREVTLENYSHLVSLGILHSKPELIRRLEQ 
GEVPWGEERRRRPGPCAGIYAEHVLRPKNLGLAHQRQQQLQFSD 
QSFQSDTAEGQEKEKSTKPMAFSSPPLRHAVSSRRRNSWEIES 
SQGQRENPTEIDKVLKGIENSRWGAFKCAERGQDFSRKMMVIIH 
KKAHSRQKLFTCRECHQGFRDES.ALLLHQNTHTGEKSYVCSVCG 
RGFSLKANLLRHQRTHSGEKPFLCKVCGRGYTSKSYLTVHERTH 
TGEKPYECQECGRRFNDKSSYNKHLKAHSGEKPFVCKECGRGYT 
N KS YFWH KR I HSGE KP YRCQE CGRG FSN KS HL I THQRTHSGEK 
PFACRQCKQSFSVKGSLLRHQRTHSGEKPFVCKDCERSFSQKST 
LVYHQRTHSGEKPFVCRECGQGFIQKSTLVKHQITHSEEKPFVC 
KDCGRGFIQKSTFTLHQRTHSEEKPYGCRECGRRFRDKSSYNKH 
LRAHLG E KR FF CRDCGRG FTLKPNLT I HQRTHS G E KP FMC KQCE 
KSFSLKANLLRHQWTHSGERPFNCKDCGRGFILKSTLLFHQKTH 
SGEKPFICSECGQGFIWKSNLVKHQLAHSGKQPFVCKECGRGFN 
WKGNLLTHQRTHSGEKPFVCNVCGQGFSWKRSLTRHHWRIHSKE 
KPFVCQECKRGYTSKSDLTVHERIHTGERPYECQECGRKFSNKS 
YYSKHLKRHLREKRFCTGSVGEASS 


6226 


29 


266 


TKVS ELLGGSQRLFFLPLWRRLCRCGLGPRVS PMAGPRVEVDGS 
IMEGGGQSLRVSTGLSWLLSLPWRAQRIRAGRSYA 


6227


2581 


890 


MSASSLLEQRPKGQGNKVQNGSVHQKDGLNDDDFEPYLSPQARP 
NNAYTAMSDSYLPSYYSPSIGFSYSLGEAAWSTGGDTAMPYLTS 
YGQLSNGEPHFLPDAMFGQPGALGSTPFLGQHGFNFFPSGIDFS 
AWGNWSSQGQSTQSSGYSSNYAYAPSSLGGAMIDGQSAFANETL 
NKAPGMNTIDQGMAALKLGSTEVASNVPKWGSAVGSGSITSNI 
VASNSLPPATIAPPKPASWADIASKPAKQQPKLKTKNGIAGSSL 
PPPPIKHNMDIGTWDNKGPVAKAPSQALVQNIGQPTQGSPQPVG 
QQANNSPPVAQASVGQQTQPLPPPPPQPAQLSVQQQAAQPTRWV 
APRNRGSGFGHNGVDGNGVGQSQAGSGSTPSEPHPVLEKLRSIN 
NYNPKD FDWNLKHGRVFI I KS YSEDD I HRS I KYN I WCSTEHGNK 
RLDAAYRSMNGKGPVYIiLFSVNGSGHFCGVAEMKSAVDYNTCAG 
VW SQD KWKGR FD VR W I FVKD VPNSQLRH I R LENNENKP VTNS RD 
TQEVPLEKAKQVLKIIASYKHTTSIFDDFSHYEKRQ 


6228 


47 


1978


GRRCRRRGAVMELAQEARELGCWAVEEMGVPVAARAPESTLRRL 
CLGQGAD I WAY I LQHVHSQRTV KKI RGNLLW YGHQD S PQVRRKL 
ELEAAVTRLRAEIQELDQSLELMERDTEAQDTAMEQARQHTQDT 
QRRALLLRAQAGAMRRQQHTLRDPMQRLQNQLRRLQDMERKAKV 
DVTFGSLTSAALGLEPWLRDVRTACTLRAQFLQNLLLPQAKRG 
SLPTPHDDHFGTSYQQWLSSVETLLTNHPPGHVLAALEHIiAAER 
EAEIRSLCSGDGLGDTEISRPQAPDQSDSSQTLPSMVHLIQEGW 
RTVGVLVSQRSTLLKERQVLTQRLQGLVEEVERRVLGSSERQVL 
ILGLRRCCLWTELKALHDQSQELQDAAGHRQLLLRELQAKQQRI 
LHWRQLVEE TQEQ VR LL I KGNSAS KTRLCRS PGE VLAL VQR KW 
PTFEAVAPQSRELLRCLEEEVRHLPHILLGTLLRHRPGELKPLP 
TVLPSIHQLHPASPRGSSFIALSHKLGLPPGKASELLLPAAASL 
RQDLLLLQDQRSLWCWDLLHMKTSLPPGLPTQELLQIQASQEKQ 
QKENLGQALKRLEKLLKQALERIPELQGIVGDWWEQPGQAALSE 
ELCQGLSLPQWRLRWVQAQGALQKLCS 


6229 


1571 


560 


GPSLLGTRGTPNPARTLQIFFLIIGRRLTGRMAAVDDLQFEEFG
NAATSLTANPDATTVNIEDPGETPKHQPGSPRGSGREEDDELLG 
NDDSDKTELLAGQKKSSPFWTFEYYQTFFDVDTYQVFDRIKGSL 
LPIPGKNFVRLYIRSNPDLYGPFWICATLVFAIAISGNLSNFLI 
HLGEKTYHYVPEFRKVS IAAT 1 1 YAYAWLVP LALWG FLMWRNSK 
VMN IVSYSFLEI VCVYG YSLF I Y I PTAI LW I I PHKAVRW I LVM I 
ALG I SGSLLAMTFW PAVREDNRRVALAT I VT I VLLHMLLS VG CL 
AYF FDAPEMDHLPTTTAT PNQT VAAAKS S 


6230 


1723 


600 


S KMSGRSGKKKMSKLSRSARAGVI FPVGRLMRYLKKGTFKYR I S 
VGAPVYMAAVIEYLAAEILELAGNAARDNKKARIAPRHILLAVA 
NDEELNQLLKGVTIASGGVLPRIHPELLAKKRGTKGKSETILSP 
PPEKRGRKATSGKKGGKKSKAAKPRTSKKSKPKDSDKEGTSNST 
SEDGPGEX3FTILSSKSLVLGQKLSLTQSDISHIGSMRVEGIVHP 
TTAEIDLKEDIGKALEKAGGKEFLETVKELRKSQGPLEVAEAAV 
SQS SGLAAKFVTHCH I PQWGSDKCEEQLEE T I KNCLS AAEDKKL 
KSVAFPPFPSGRNCFPKQTAAQVTLKAISAHFDDSSASSLKNVY 
FLLFDSES1GIYVQEMAKLDAK 


6231 


149 


870 


LIFSSSTMDRS LRNVL WS FG FLLLF T A YGG LQSLQSSLYSEEG 
LGVTALSTLYGGMLLSSMFLPPLLIERLGCKGTIILSMCGYVAF 
SVGNFFAS WYTLI PTS ILLGLGAAPLWSAQCTYLTITGNTHAEK 
AG KRG KDMVNQ Y FG I F FL I FQS SG VWGNL I S S L V FGQT P S QE TL 
PEEQLTSCGASDCLMATTTTNSTQRPSQQLVYTLLGIYTGSGVL 
AVLMIAAFLQPIRDVQRESE 


6232 


3679 


1476 


FVAGTTMAGFWVGTAPLVAAGRRGRWPPQQLMLSAALRTLKHVL 
YYSRQCLMVSRNLGSVGYDPNEKTFDKILVANRGEIACRVIRTC 
KKMG I KT VAIHS DVDAS S VHVKMADEAVCVG PAPTS KS YLNMDA 
IMEAI KKTRAQAVHPGYGFLSENKEFARCLAAEDWFIGPDTHA 
I QAMGDKI ES KL LAKKAE VNT I PG FDG WKDAE E AVR IAR E I G Y 
PVM I KASAGGGGKGMRI AWDDEETRDGFRLSSQEAASSFGDDRL 
LIEKFIDNPRHIEIQVLGDKHGNALWLNERECSIQRRNQKWEE 
APSI FLDAETRRAMGEQAVALARAVKYSSAGTVEFLVDSKKNFY 
FLEMNTRLQVEHPVTECITGLDLVQEMIRVAKGYPLRHKQADIR 
I NG WAVE CR V YAE D P YKS FG L P S IGRL S Q YQE PLHLPG VR VDS G 
IQPGSDISIYYDPMISKLITYGSDRTEALKRMADALDNYVIRGV 
THNIALLREVI INSRFVKGDI STKFLSDVYPDGFKGHMLTKSEK 
NQLLAIASSLFVAFQLRAQHFQENSRMPVIKPDIANWELSVKLH 
DKVHT WASNNGS VF S VE VDG S KLNVTS TWNLAS PLLS VS VDGT 
QRTVQCLSREAGGNMSIQFLGTVYKVNILTRLAAELNKFMIiEKV 
TEDTS S VLRS PMPG VWAVS VKPGDAVAEGQE I CVI EAMKMQNS 
MTAGKTGTVKSVHCQAGDTVGEGDLLVELE 


6233


1 


2654 


HSTRENLNAGNFNFPSEGHLVRSTGPGGSFAKHMVAQCVSPKGP 
LACSRT YFFGATHVP YLGGDS KLPKKTEQ I RLLSQ I YAAVI EAV 
LAG I AC YAKTSS LTKAKE VAEQTLGSGLDS FEL I P FKAALRS KM 
T FH IHAVWNQGR I V P LDS E D S LS F VKTACMAVYD I PDLLGGNG C 
LGS WFS ES FLTS Q I LVKE KDGT VTTETS S WLTAAVPRFCS WL 
VEDNEVKLSEKTHQAVRGDESFLGTYLTGGEGAYLYSSNLQSWP 
EEGNVHFFSSGLLFSHCRHGSI I ISKDHMNSISFYDGDST9TVA 
ALLIDFKSSLLPHLPVHFHGSSNFLMIALFPKSKIYQAFYSEVF 
SLWKQQDNSGISLKVIQEDGLSVEQKRLHSSAQKLFSALSQPAG 
EKRSSLKLLSAKLPELDWFLQHFAISSISQEPVMRTHLPVLLQQ 
AEINTTHRIESDKVIISIVTGLPGCHASELCAFLVTLHKECGRW 
MVYRQIMDSSECFHAAHFQRYLSSALEAQQNRSARQSAYIRKKT 
RLL WLQG YTD VI D WQALQTHPDSNVKAS FTIGA I TACVE PMS 
CYMEHRFLFPKCLDQCSQGLVSNWFTSHTTEQRHPLLVQLQSL 
IRAANPAAAFI LAENG IVTRNED I ELILSENSFSS PEMLRSRYL 
MYPGWYEGKLNAGSVYPLMVQICVWFGRPLEKTRFVAKCKAIQS 
SIKPSPFSGNIYHILGKVKFSDSERTMEVCYNTLANSLSIMPVL 
EGPTPPPDSKSVSQDSSGQQECYLVFIGCSLKEDSIKDWLRQSA 
KQKPQRKALKTRGMLTQQEIRSIHVKRHLEPLPAGYFYNGTQFV 
NFFGDKTDFHPLMDQFMNDYVEEANREIEKYNQELEQQEYHDLF 
ELKP 


6234 


1731 


404 


PR VREDMDHKS PGNKGS LVYAG I KS I VKS SLGMVES SRHNWSGL 
D KQ S D I QNLNE ER I LALQL CG W I KKG TD VD VG P FLNS LVQEGE W 
ERAAAVALFNLDIRRAI QI LNEGAS S E KGDLNLNWAMALSGYT 
DEKNSLWREMCSTLRLQLNNPYLCVMFAFLTSETGSYDGVLYEN 
KVAVRDRVAFACKFLS DTQLNR Y I EKLTNEMKEAGNLEG I LLTG 
LTKDGVDLMESYVDRTGDVQTASYCMTiQGSPLDVLKDERVQYWI 
ENYRNLLDAWRFWHKRAEFD I HRS KLDP S SKP LAQVFVS CNFCG 
KSISYSCSAVPHQGRGFSQYGVSGSPTKSKVTSCPGCRKPLPRC 
ALCLINMGTPVSSCPGGTKSDEKVDLSKDKKIAQFNNWFTWCHN 
CRHGGHAGHMLSWFRDHAECPVSACTCKCMQLDTTGNLVPAETV 
QP 


6235 


1 


571 


EKRDHRLPSWPRAALKVPGRGGRVGTTPELAAGGIMATRNPPPQ 
DYESDDDSYEVLDLTEYARRHQWWNRVFGHSSGPMVEKYSVATQ 
IVMGGVTGWCAGFLFQKVGKLAATAVGGGFLLLQIASHSGYVQI 
DWKRVEKDVNKAKRQIKKRANKAAPEINNLIESATEFIKQNIVI 
SSGFVGGFLLGLAS 


6236 


1 


703 


WDQNKGAAAGSGLTLPSLPSARFSAGPPTQRSRPTMSNMEKHLF 
NLKFAAKELSRSAKKCDKEEKAEKAKIKKAIQKGNMEVARtHAE 
NAI RQKNQAVNFLRM S AR VDAVAARVQTAVTMGKVT KS MAG WK 
SMDATLKTMNLEKISALMDKFEHQFETLDVQTQQMEDTMSSTTT 
LTTPQNQVDMLLQEMADEAGLDLNMELPQGQTGSVGTSVASAEQ 
DELSQRLARLRDQV 


6237 


312 


720 


PTAMAEEG I AAGGVMDVNTALQE VLKTALIHDGLARG I REAAKA 
LDKRQAHLCVXiASNCDEPMYVKLVEAJjCAEHQINLIKVDDNKKL 
GEWVGLCKIDREGKPRKWGCSCVWKDYGKESQAKDVIEEYFK 
CKK 


6238 


? 


4666 


EEVPTQESVKWEINVIIKNPEIVFVADMTKNDAPALVITTQCEI 
CYKGNLENSTMTAAIKDLQVRACPFLPVKRKGKITTVLQPCDLF 
YQTTQKGTD PQV I DM S VKS LTLKYS PVI INTM ITI TSAL YTTKE 
TIPEETASSTAHLWEKKDTKTLKMWFLEESNETEKIAPTTELVP 
KGEMIKMNI DS I FI VLEAG IGHRTVPMLLAKSRFSGEGKNWSSL 
INLHCQLELEVHYYNEMFGVWEPLLEPLEIDQTEDFRPWNIjGIK 
MKKKAKMAIVESDPEEENYKVPEYKTVISFHSKDQLNITLSKCG 
LVMLNNLVKAFTE AATGS S ADFVKDLAP FM I LNS LGLT I S VS PS 
DSFSVLNIPMAKSYVLKNGESLSMDYIRTKDNDHFNAMTSLSSK 
LFFILLTPVNHSTADKIPLTKVGRRIiYTVRHRESGVERSIVCQI 
DTVEGSKKVTIRSPVQIRNHFSVPLSVYEGDTLLGtASPENEFN 
I PLGS YRS F I FLKPEDEN YQMCEG IDFEEII KNDGALLKKKCRS 
KNPSKESFLINIVPEKDNLTSLSVYSEDGWDLPYIMHLWPPILL 
RN LLP YK I AY Y I EG I ENS VFTLS EGHSAQ I CTAQLG KARLHLKL 
LDYLNHDWKS EYH I KPNQQDI S FVS FTCVTEMEKTDLD IAVHMT 
YNTGQT WAFHS P Y WMVNKTGRMLQ YKADG I HRKH P PNYKKPVL 
F S FQ PNH F FNNNKVQLM VTDS E LSNQFS I DTVGS HG AVKCKG L K 
MDYQVGVTIDLSSFNITRIVTFTPFYMIKNKSXYHISVAEEGND 
KWLSLDLEQCIPFWPEYASSKLLIQVERSEDPPKRIYFNKQENC 
ILLRLDNELGGIIAEVNLAEHSTVITFLDYHDGAATFLLINHTK 
NEL VQYNQS SLSEIEDSLP PGKAV FYTWAD P VGSRRL KWR CR KS 
HGEVTQKDDMMMPIDLGEKTIYLVSFFEGLQRIILFTEDPRVFK 
VTYESEKAELAEQE I AVALQDVG I S LVNNYTKQEVAY IG I TS SD 
WWETKPKKKARWKPMSVKHTEKLEREFKEYTESSPSEDKVIQL 
DTNVPVRLTPTGHNMKILQPHVIALRRNYLPALKVEYNTSAHQS 
SFRIQIYRIQIQNQIHGAVFPFVFYPVKPPKSVTMDSAPKPFTD 
VSIVMRSAGHSQISRIKYFKVLIQEMDLRLDLGFIYALTDLMTE 
AEVTENTEVELFHKDIEAFKEEYKTASLVDQSQVSLYEYFHISP 
I KLHLS VSLSSGREEAKDSKQNGGLI PVHSLNLLLKS IGATLTD 
VQD W FKLAF FE LNYQ FHTTSDLQS E VI RHYS KQ AI KQM YVL IL 
GLDVLGNPFGLIREFSEGVEAFFYEPYQGAIQGPEEFVEGMALG 
LKAL VGG AVGG LAG AAS KI TGAMAKGVAAMTMDEDYQQKRREAM 
NKQ PAG FREG I TRGG KGLVS G F VSG I TG I VT K P I KGAQKG GAAG 
F FKG VG KG L VGAVAR PTGG 1 1 DMAS S TFQG I KRATET S E VES LR 
PPRFFNEDGVIRPYRLRDGTGNQMLQKIQFYREWIMTHSSSSDD 
DDDDDDDDESDLNH 


6239 


2108 


634 


KPGMAGKGSSGRRPLLLGLLVAVATVHLVICPYTKVEBSFNLQA 
THDLLYHWQDLEQYDHLEFPGWPRTFLGPWIAVFSSPAVYVL 
SLLEMSKFYSQLIVRGVLGLGVIFGLWTLQKEVRRHFGAMVATM 
FC WVTAMQ FHLM F YCTRTL PNVLAL P WLLALAAW LRHE WARFI 
WLSAFAIIVFRVELCLFLGLLLLLALGNRKVSWRALRHAVPAG 
ILCLGLTVAVDSYFWRQLTWPEGKVLWYNTVLNKSSNWGTSPLL 
WYFYSALPRGLGCSLLFIPLGIiVDRRTHAPTVLALGFMALYSLL 
PHKELRFIIYAFPMIjNITAARGCSYLLNNYKKSWLYKAGSLLVI 
GHLWNAAYSATALYVSHFNYPGGVAMQRLHQLVPPQTDVLLHI 
DVAAAQTGVSRFLQVNSAWRYDKREDVQPGTGMLAYTHILMEAA 
PGLLALYRDTHRVIASWGTTGVSLNLTQLPPFNVHLQTKLVLL 
ERLPRPS 


6240 


2202 


1176 


HERGDSLKEPTSIAESSRHPSYRSEPSLEPESFRSPTFGKSFHF 
DPLSSGSRSSSLKSAQGTGFELGQLQSIRSEGTTSTSYKSLANQ 
TRNGSLSYDSLLTPSDSPDFESVQAGPEPDPPLGYTSPFLSARL 
AQQREAERHPRLVPTGPTHREPSPVRYDNLSRHIVASLQEREKL 
LRQSPPLPGREEEPGLGDSGIQSTPGSGHAPRTSSSSDDSKRSP 
LG KT P LGR P AVP R FGKPDG LRGUG VG S P E PG P TAP YLGRS M S YS 
SQKAQPGVSETEEVALQPLLTPKDEVQLKTTYSKSNGQPKSLGS 
ASPGPGQPPLSSPTRGGVKKVSGVGGTTYEISV 


6241 


3 


1341 


RNAEE KKRLS LQRE KI IARVS IDNRTRALVQALRRTTDPKLCI T 
R VEELTFHLLE FPEGKGVAVKERI I P YLLRLRQ I KDETLQAAVR 
E I LAL I G YVDPVKGRGIR I LS I DGGGTRG WALQTLRKLVELTQ 
KPVHQLFDYICGVSTGAILAFMLGLFHMPLDECEELYRKLGSDV 
FSQNVIVGTVKMSWSHAFYDSQTWENILKDRMGSALMIETARNP 
TCPKVAAVSTIVNRGITPKAFVFRNYGHFPGINSHYLGGCQYKM 
WQA IRAS S AAPG YFAE YALGNDLHQDGGLZjLNNPS AIiAMHE CKC 
LWPDVPLECIVSLGTGRYESDVRNTVTYTSLKTKLSNVINSATD 
TEEVHIMLDGLLPPDTYFRFNPVMCENIPLDESRNEKLDQLQLE 
GLKYI ERNEQKMKKVAKI LSQEKTTLQKINDWI KLKTDMYEGLP 
FFSKL 


6242 


198 


1310 


QHFLPGAETWSPGAAVCTARRFPGRSLAAFPRPAAPRRAVEMGE 
SSEDIDQMFSTLLGEMDLLTQSLGVDTLPPPDPNPPRAEFNYSV 
GFKDLNESLNALEDQDLDALMADLVADISEAEQRTIQAQKESLQ 
NQHHSASLQAS I FSGAASLGYGTNVAATGISQ YEDDLPPPPADP 
VLDLPLPPPPPEPLSQEEEEAQAKADKIKIALEKLKEAICVKKLV 
VKVHMNDNSTKSLMVDERQLARDVLDNLPEKTHCDCNVDWCLYE 
IYPELQIERFFEDHENWEVLSDWTRDTENKILFLEKEEKYAVF 
KNPQNFYLDNRGKKESKETNEKMNAKNKESLLEVRLILQSGRKE 
KD V CS I F KS FAS ENNG K I 


6243 


1509 


614


RSASRFSGCWSRDSTCCCCPSTCWSRSSASCPRARWPPSSAPAT 
TS RAS S RRLACG PQTRAGAETRSTAM I RANS AARDTRRATCRSA 
AG T P S P TTMTC LTDV P TGCAAVE P TARLPAAAWAS T I TTGC CPA 
MGQAGAGPAGRKGSEAGGGPGRAHHAHPSPLPREPRVRTGPPAH 
SPTPGSIDPSPELSWGSAGVTQESPLLDPVDFLLFRTRAVDPLR 
RVFFFFYQHLTFFSIQPQPPPCHAFHPRDPPAGTKRQLILVPLK 
GPPILAPILSLTPILSRWSCYFPRSRIAQGWHLS 


6244 


2119 


1745 


FEHAYASQFGTFLGNNESERCKLKLQQKTMSLWSWVNQPSELSK 
FTNPLFEANNLVIWPSVAPQSLPLWEGIFLRWNRSSKYLDEAYE 
EMVNI IEYNKELQAKVNILRRQLAELETEDGMQESP 


6245 


81 


1148 


LSLRNAKYSFPQELISLFSMTDLNDNICKRYIKMITNIVILSLI 
ICISLAFWIISMTASTYYGNliRPISPWRWLFSVWPVLIVSNGL 
KKKSLDHSGALGGL WG FI LTIANFS FFTSLLMFFLSS S KLTKW 
KGEVKKRLDS E YKEGGQRNWVQVFCNGAVPTELALjuYMI ENGPG 
EIPVDFSKQYSASWMCLSLLAALACSAGDTWASEVGPVLSKSSP 
RL I TT WE KV P VGTNGGVT WGLVS S LLGGTFVG I AYFLTQL I FV 
NDLD I SAPQW P 1 1 AFGGLAGLLGS I VDS YLGATMQYTGLDESTG 

MWNSPTNKARHIAGKPILDNNAVNLFSSVLIALLLPTAAWGFW 
PRG 


6246 


1177 


359 


SLWPWILMDDSLMQISLQLLCVYTANFPNGCSSLCWSSCGQHPV 
QATHRGAVSNSLMLCILKLASQMPLENTTVQQMVFMLLSNLALS 
HDCKGVIQKSNFLQNFLSLALPKGGNKHLSNLTILWLKLLLNIS 
SGEDGQQMILRLDGCLDLLTEMSKYKHKSSPLLPLLIFKNVCFS 
PANKPKILANEKVITVLAACLESENQNAQRIGAAALWALIYNYQ 
KAKTALKSPSVKRRVDEAYSLAKKTFPNSEANPLNAYYLKCLEN 
LVQLLNSS 


6247 


3 


1678 


NSRWGPWTEPSAGSLRPMARKQNRNSKELGLVPLTDDTSHAGP 
PGPGRALLECDHLRSGVPGGRRRKDWSCSLLVASLAGAFGSSFL 
YGYNLSWNAPTPYIKAFYNESWERRHGRPIDPDTLTLLWSVTV 
S I FAIGGL VGTL I VKMIG KVLGRKHTLLANNG FAI SAALLMACS 
LQAGAFEMLIVGRFIMGIDGGVALSVLPMYLSEISPKEIRGSLG 
QVTAI FI C IG VFTGQJjLGL PELLGKES TWPYLFGVI WPA WQL 
LSLPFLPDS PRYLLLEKHNEARAVKAFQTFLGKAHVSQEVEE VL 
AESR VQRS I RLVS VLELLRAP YVRWQ WT VI VTMACYQLCGLNA 
I WFYTNS I FGKAGI PPAKI PYVTLSTGGIETLAAVFSGLVIEHL 
GRRPLLIGGFGLMGLFFGTLTITLTLQDHAPWVPYLSIVGILAI 
I AS F C SG PGG I P F I LTGE F FQQ S QR PAAF 1 1 AGT VNWLSNFAVG 
LLFPFIQKSLDTYCFLVFATICITGAIYLVFVLPETKNRTYAEI 
SQAFSKRNKAYPPEEKIDSAVTDGKINGRP 


6248 


56


1773 


VPPPRMMAAVPPGLEPWNRVRIPKAGNRSAVTVQNPGAALDLCI
AAVIKECHLVILSLKSQTLDAETDVLCAVLYSNHNRMGRHKPHL
ALKQVEQCLKRLKNMNLEGSIQDLFELFSSNENQPLTTKVCVVP
SQPWELVLMKVLGACKLLLRLLDCCCKTFLLTVKHLGLQEFII
LNLVMVGLVSRLWVLYKGVLKRLILLYEPLFGLLQEVARIQPMP
YFKDFTFPSDITEFLGQPYFEAFKKKMPIAFAAKGINKLLNKLF
LINEQSPRASEETLLGISKKAKQMKINVQNNVDLGQPVKNKRVF
KEESSEFDVRAFCNQLKHKATQETSFDFKCSQSRLKTTKYSSQK 
VIGTPHAKSFVQRFREAESFTQLSEEIQMAWWCRSKKLKAQAI 
FLGNKLLKSNRLKHLEAQGTSLPKKLECIKTSICNHLLRGSGIK
TSKHHLRQRRSQNKFLRRQRKPQRKLQSTLLREIQQFSQGTRKS
ATDTSAKWRLSHCTVHRTDLYPNSKQLLNSGVSMPVIQTKEKMI
HENLRGIHENETDSWTVMQINKNSTSGTIKETDDIDDIFALMGV


6249 


56 


1773 


VPPPRMMAAVPPGLEPWNRVRIPKAGNRSAVTVQNPGAALDLCI 
AAVIKECHLVILSLKSQTLDAETDVLCAVLYSNHNRMGRHKPHL
ALKQVEQCLKRLKNMNLEGSIQDLFELFSSNENQPLTTKVCVVP
SQPWELVLMKVLGACKLLLRLLDCCCKTFLLTVKHLGLQEFII
LNLVMVGLVSRLWVLYKGVLKRLILLYEPLFGLLQEVARIQPMP
YFKDFTFPSDITEFLGQPYFEAFKKKMPIAFAAKGINKLLNKLF 
LINEQSPRASEETLLGISKKAKQMKINVQNNVDLGQPVKNKRVF 
KEESSEFDVRAFCNQLKHKATQETSFDFKCSQSRLKTTKYSSQK 
VIGTPHAKSFVQRFREAESFTQLSEEIQMAWWCRSKKLKAQAI 
FLGNKLLKSNRLKHLEAQGTSLPKKLECIKTSICNHLLRGSGIK
TSKHHLRQRRSQNKFLRRQRKPQRKLQSTLLREIQQFSQGTRKS
ATDTSAKWRLSHCTVHRTDLYPNSKQLLNSGVSMPVIQTKEKMI
HENLRGIHENETDSWTVMQINKNSTSGTIKETDDIDDIFALMGV 


6250 


232 


1306 


LAALHIMALPFRKDLEKYKDLDEDELLGNLSETELKQLETVLDD 
LDPENALLPAGFRQKNQTSKSTTGPFDREHLLSYLEKEALEHKD 
REDYVPYTGEKKGKIFIPKQKPVQTFTEEKVSIiDPELEEAIiTSA 
SDTELCDLAAILGMHNLITNTKFCNIMGSSNGVDQEHFSNWKG 
EKILPVFDEPPNPTNVEESLKRTKENDAHLVEVNLNNIKNIPIP 
TLKD FAKALETNTHVKC FS LAATRSNDPVATAFAEMLKVNKTLK 
SLNVESNFITGVGILALIDALRDNETLAELKIDNQRQQLGTAVE 
LEMAKMLEENTNILKFGYQFTQQGPRTRAANAITKNNDLVRKRR 
VEGDHQ 


6251 


62 


972 


TPGSGPMSAWAAASLSRAAARCLIiARGPGVRAAPPRDPRPSHPE 
PRGCGAAPGRTLHFTAAVPAGHNKWS KVRHI KGPKDVERSRI FS 
KLCLNIRLAVKEGGPNPEHNSNLANILEVCRSKHMPKSTIETAL 
KMEKSKDTYLLYEGRGPGGSSLLIEALSNSSHKCQADIRHILNK 
NGGVMAVGARHS FDKKGVI WEVEDREKKAVNLERALEMAIEAG 
AED VKE TEDE E E RNVF KF I CDAS S LHQ VRKKLDS LG LCS VS CAL 
EF I PNS KVQLAE P DLE Q AAHL I Q ALS NH EDV I HVYDNI E 


6252 


27 


1897 


EEFCTWIAVRVGEMETAPKPGKDVPPKKDKLQTKRKKPRRYWEE 
ETVPTTAGASPGPPRNKKNRELRPQRPKNAYILKKSRISKKPQV
PKKPREWKNPESQRGLSGAQDPFPGPAPVPVEWQKFCRIDKSR 
KLPHSKAKTRSRLEVAEAEEEETSIKAARSELLLAEEPGFLEGE
DGEDTAKICQADIVEAVDIASAAKHFDLNLRQFGPYRLNYSRTG
RHLAFGGRRGHVAALDWVTKKLMCEINVMEAVRDIRFLHSEALL
AVAQNRWLHIYDNQGIELHCIRRCDRVTRLEFLPFHFLLATASE
TGFLTYLDVSVGKIVAALNARAGRLDVMSQNPYNAVIHLGHSNG 
TVSLWSPAMKEPLAKILCHRGGVRAVAVDSTGTYMATSGLDHQL 
KIFDLRGTYQPLSTRTLPHGAGHLAFSQRGLLVAGMGDWNIWA
GQGKASPPSLEQPYLTHRLSGPVHGLQFCPFEDVLGVGHTGGIT
SMLVPGAGEPNFDGLESNPYRSRKQRQEWEVKALLEKVPAELIC 
LDPRALAEVDVISLEQGKKEQIERLGYDPQAKAPFQPKPKQKGR 
SSTASLVKRKRKVMDEEHRDKVRQSLQQQHHKEAKAKPTGARPS 
ALDRFVR 


6253 


27 


1897 


EEFCTWIAVRVGEMETAPKPGKDVPPKKDKLQTKRKKPRRYWEE 
ETVPTTAGASPGPPRNKKNRELRPQRPKNAYILKKSRISKKPQV 
PKKPREWKNPESQRGLSGAQDPFPGPAPVPVEWQKFCRIDKSR 
KLPHSKAKTRSRLEVAEAEEEETSIKAARSELLLAEEPGFLEGE 
DGEDTAKICQADIVEAVDIASAAKHFDLNLRQFGPYRLNYSRTG
RHLAFGGRRGHVAALDWVTKKLMCEINVMEAVRDIRFLHSEALL
AVAQNRWLHIYDNQGIELHCIRRCDRVTRLEFLPFHFLLATASE
TGFLTYLDVSVGKIVAALNARAGRLDVMSQNPYNAVIHLGHSNG
TVSLWSPAMKEPLAKILCHRGGVRAVAVDSTGTYMATSGLDHQL
KIFDLRGTYQPLSTRTLPHGAGHLAFSQRGLLVAGMGDWNIWA
GQGKASPPSLEQPYLTHRLSGPVHGLQFCPFEDVLGVGHTGGIT
SMLVPGAGEPNFDGLESNPYRSRKQRQEWEVKALLEKVPAELIC
LDPRALAEVDVISLEQGKKEQIERLGYDPQAKAPFQPKPKQKGR 
SSTASLVKRKRKVMDEEHRDKVRQSLQQQHHKEAKAKPTGARPS 
ALDRFVR 


6254 


155 


1139 


HALGRRGGSQELSAAACGCFALRLRAPGSGRPALAPGAAAFAGL
GGAPRFPPRGSAAGRTMLLKEYRICMPLTVDEYKIGQLYMISKH
SHEQSDRGEGVEWQNEPFEDPHHGNGQFTEKRVYLNSKLPSWA
RAWPKIFYVTEKAWNYYPYTITEYTCSFLPKFSIHIETKYEDN
KGSNDTIFDNEAKDVEREVCFIDIACDEIPERYYKESEDPKHFK
SEKTGRGQLREGWRDSHQPIMCSYKLVTVKFEVWGLQTRVEQFV
HKWRDILLIGHRQAFAWVDEWYDMTMDDVREYEKNMHEQTNIK
VCNQHSSPVDDIESHAQTST


6255 


1 


1444 


PTRPQQELLVSLATVIFVASQKALSVESKAVIKQQLESVSNGWT 
VYRIARQASRMGNHDMAKELYQSLLTQVASKHFYFWLNSLKEFS 
HAEQCLTGLQEENYSSALSCIAESLKFYHKGIASLTAASTPLNP 
LS FQCEFVKLR I DLLQAFSQL I CTCNSLKTS PPPAIATT IAMTL 
GNDLQRCGRISNQMKQSMEEFRSLASRYGDLYQASFDADSATLR 
NVELQQQSCLLISHAIEALILDPESASPQEYGSTGTAHADSEYE 
RRMMSVYNHVLEEVESLNGKYTPVSYMHTACLCNAIIALLKVPL 
SFQRY FFQKLQSTS I KLAL.S PS PRNP AE P IAVQNNQQLAL KVEG 
WQHGSKPGLFRKIQSVCLNVSSTLQSKSGQDYKIPIDNMTNEM 
EQRVE PHNDYFSTQFLLNFAI LGTHN I TVES S VKDANG I VWKTG 
PRTTIFVKSLEDPYSQQIRLQQQQAQQPLQQQQQRNAYTRF 


6256 


1 


1542 


onvjnvj/aLu- nj~u.v tr sr l\0 UV it oLjljo lol jvccJ\if\j 1 PlA IJJaWALiA 

VDE QEAAAES LSNLHLKEEKI KPDTNGAWKTNANAE KTDE E E K 
EDRAAQSLLNKLIRSNLVDNTNQVEVLQRDPNSPLYSVKSFEEL 
RLKPQLLQdVYAMGFNRPSKIQENALPLMLABPPQNLIAQSQSG 
TGKTAAFVLAMLSQVEPANKYPQCLCLS PTYELALQTGKVI EQM 
GKFYPELKLAYAVRGNKLERGQKISEQIVIGTPGTVLDWCSKLK 
FIDPKKI KVFVLDEADVMI ATQGHQDQS IRIQRMLPRNCQMLLF 
SATFEDSVWKFAQKWPDPNVIKLKREEETLDTIKQYYVLCSSR 
DEKFQALCNL YGA I TIAQAM I FCHTRKTAS WLAAELS KEGHQVA 
LLSGEMMVEQRAAVIERFREGKEKVLVTTNVCARGIDVEQVSW 
I NFDL P VDKDGN PDNET YLH R I GRTGR FG KRGLA VNMVDS KHSM 
NILNRIQEHFNKKIERLDTDDLDEIEKIAN 


6257 


210 


615


AFIPAMAELIQKKLQGEVEKYQQLQKDLSKSMSGRQKLEAQLTE
NNIVKEELALLDGSNWFKLLGPVLVKQELGEARATVGKRLDYI
TAEIKRYESQLRDLERQSEQQRETLAQLQQEFQRAQAAKAGAPG
KA


6258 


210 


615 


AFIPAMAELIQKKLQGEVEKYQQLQKDLSKSMSGRQKLEAQLTE 
NNIVKEELALLDGSNWFKLLGPVLVKQELGEARATVGKRLDYI
TAEIKRYESQLRDLERQSEQQRETLAQLQQEFQRAQAAKAGAPG 
KA 


6259 


2 


1540 


ILEKGFPSQCHPERKWKVDDVLESSQENEDDHFWELLFHNNKTV 
S VENGDRGS KTFNLGTDPVS LRNYP YK I CDS CEMNLKNI SGL 1 1 
SKKNCSRKKPDEFNVCEKLLLDIRHEKIPIGEKSYKYDQKRNAI 
NYHQDLSQPSFGQSFEYSKNGQGFHDEAAFFTNKRSQIGETVCK 
YNECGRTFIESLKLNISQRPHLEMEPYGCSICGKSFCMNLRFGH 
QRALTKDNPYEYNEYGEIFCDNSAFIIHQGAYTRKILREYKVSD 
KTWEKSALLKHQIVHMGGKSYDYNENGSNFSKKSHLTQLRRAHT 
GEKTFECGECGKTFWEKSNLTQHQRTHTGEKPYECTECGKAFCQ 
KPHLTNHQRTHTGEKPYECKQCGKTFCVKSNLTEHQRTHTGEKP 
Y E CNACG KS FCHRS ALT VHQRTHTGEKP F I CN E CGKS F CVKSNL 
IVHQRTHTGEKPYKCNECGKTFCEKSALTKHQRTHTGEKPYECN 
ACGKTFSQRSVLTKHQRIHTRVKALSTS 


6260 


2081 


1436 


GTGPEIHACAHASARAPGSRAMALRELKVCLLGDTGVGKSS I VW 
RFVEDSFDPNINPTIGASFMTKTVQYQNELHKFLIWDTAGQERF 
KAiaAfMi iKbbAAAI I VYDITKJEETFSTLKJvWVKELRQHGPPNI 
WA I AGNKCDL I DVRE VMERDAKD YADS IHA I F VETSAKNAINI 
NELFIEISRRIPSTDANLPSGGKGFKLRRQPSEPKRSCC 


6261 


3 


1188 


FWYRLGPGTRSRWPRRGSWAASLVPRGPSPAALVTSPCPPDPLR 
SPACEPCRPDFAPRPALLLRSGPRSAPAVTGKPALKGQPGPWPG
MAEVSIDQSKLPGVKEVCRDFAVLEDHTLAHSLQEQEIEHHLAS
NVQRNRLVQHDLQVAKQLQEEDLKAQAQLQKRYKDLEQQDCEIA
QEIQEKLAIEAERRRIQEKKDEDIARLLQEKELQEEKKRKKHFP
EFPATRAYADSYYYEDGGMKPRVMKEAVSTPSRMAHRDQEWYDA 
EIARKLQEEELLATQVDMRAAQVAQDEEIARLLMAEEKKAYKKA 
KEREKSSLDKRKQDPEWKPKTAKAANSKSKBSDEPHHSKNERPA 
RPPPPIMTDGEDADYTHFTNQQSSTRHFSKSESSHKGFHYKH 


6262 


2 


1759 


PECHSQGLCSVHRPGKVPQARMSGLVLGQRDEPAGHRLSQEEIL 
GSTRLVSQGLEALRSEHQAVLQSLSQTIECLQQGGHEEGLVHEK 
ARQLRRSMENIELGLSEAQVMLALASHLSTVESEKQKLRAQVRR 
LCQENQWLRDELAGTQQRLQRSEQAVAQLEEEKKHLEFLGQLRQ 
YDEDGHTSBEKEGDATKDSLDDLFPNEEEEDPSNGLSRGQGATA 
AQQGGYEIPARLRTLHNLVIQYAAQGRYEVAVPLCKQALEDLER 
TSGRGHPDVATMLNILALVYRDQNKYKEAAHLLNDALSIRESTL 
GPDHPAVAATLNNLAVLYGKRGKYKEAEPLCQRALEIREKVLGT
NHPDVAKQLNWLALLCQNQGKYEAVERYYQRALAIYEGQLGPDN
PNVARTKNNLASCYLKQGKYAEAETLYKEILTRAHVQEFGSVDD
DHKPIWMHAEEREEMSKSRHHEGGTPYAEYGGWYKACKVSSPTV
NTTLRNLGALYRRQGKLEAAETLEECALRSRRQGTDPISQTKVA 
ELLGESDGRRTSQEGPGDSVKFEGGEDASVAVEWSGDGSGTLQR 
SGSLGKIRDVLRR 


6263 


1 


2408


RELDSLADLPERIKPPYANGLSTSHLRSSSVEDVKLIISEGRPT 
IEVRRCSMPSVICEHTKQFQTISEESNQGSLLTVPGDTSPSPKP 
EVFSNVPERDLSNVSNIHSSFATSPTGASNSKYVSADRNLIKNT 
APVNTVMDSPVHLEPSSQVGVIQNKSWEMPVDRLETLSTRDFIC 
PNSNIPDQESSLQSFCNSENKVLKENADFLSLRQTELPGNSCAQ
DPASFMPPQQPCSFPSQSLSDAESISKHMSLSYVANQEPGILQQ 
KNAVQ 1 1 S S ALDTDNES T KDTENT FVLGD VQ KTDAF V P VYS D S T 
IQEASPNFEKAYTLPVLPSEKDFNGSDASTQLNTHYAFSKLTYK
SSSGHEVENSTTDTQVISHEKENKLESLVLTHLSRCDSDLCEMN 
AGMPKGNLNEQDPKHCPESEKCLLSIEDEESQQSILSSLENHSQ 
QSTQPEMHKYGQLVKVELEENAEDDKTENQIPQRMTRNKANTMA 
NQSKQILASCTLLSEKDSESSSPRGRIRLTEDDDPQIHHPRKRK 
vsRVPOPVOVQP^T.T^nAK'PK'TrinQT.AaTvnQT.^T riprnDvoopD 

^ v ^ v O rouuUMlvCilS. 1 y^oiinnl V UOlJi\XiUt!i X\^lr lOOnW 

ANPYFEYLHIRKKIEEKRKLLCSVIPQAPQYYDEYVTFNGSYLL 
DGNPLSKICIPTITPPPSLSDPLKELFRQQEWRMKLRLQHSIE 
REKLIVSNEQEVLRVHYRAARTLANQTLPFSACTVLLDAEVYNV 
PLDSQSDDSKTSVRDRFNARQFMSWLQDVDDKFDKLKTCLLMRQ 
QHEAAALNAVQRLEWQLKLQELDPATYKSISIYEIQEFYVPLVD 
VNDDFELTPI 


6264 


143 


1960 


KHRQENNALDMAPEIHMTGPMCLIENTNGELVANPEALKILSAI 
TQPWWAIVGLYRTGKSYLMNKLAGKNKGFSLGSTVKSHTKGI 
WMWCVPHPKKPEHTLVLLDTEGLGDVKKGDNQNDSWIFTLAVLL
SSTLVYNSMGTINQQAMDQLYYVTELTHRIRSKSSPDENENEDS 
ADFVSFFPDFVWTLRDFSLDLEADGQPLTPDEYLEYSLKLTQGT
SQKDKNFNLPRLCIRKFFPKKKCFVFDLPIHRRKLAQLEKLQDE
ELDPEFVQQVADFCSYIFSNSKTKTLSGGIKVNGPRLESLVLTY
INAISRGDLPCMENAVLALAQIENSAAVQKAIAHYDQQMGQKVQ
LPAETLQELLDLHRVSEREATEVYMKNSFKDVDHLFQKKLAAQL
DKKRDDFCKQNQEASSDRCSALLQVIFSPLEEEVKAGIYSKPGG
YCLFIQKLQDLEKKYYEEPRKGIQAEEILQTYLKSKESVTDAIL
QTDQILTEKEKEIEVECVKAESAQASAKMVEEMQIKYQQMMEEK
EKSYQEHVXQLTEKMERERAQLLEEQEKTLTSKLQEQARVLKER
CQGESTQLQNEIQKLQKTLKKKTKRYMSHKLKI


6265 


143 


1960 


KHRQENNALDMAPEIHMTGPMCLIENTNGELVANPEALKILSAI
TQPWWAIVGLYRTGKSYLMNKLAGKNKGFSLGSTVKSHTKGI 
WMWCVPHPKKPEHTLVLLDTEGLGDVKKGDNQNDSWIFTLAVLL
SSTLVYNSMGTINQQAMDQLYYVTELTHRIRSKSSPDENENEDS 
ADFVSFFPDFVWTLRDFSLDLEADGQPLTPDEYLEYSLKLTQGT 
SQKDKNFNLPRLCIRKFFPKKKCFVFDLPIHRRKLAQLEKLQDE 
ELDPEFVQQVADFCSYIFSNSKTKTLSGGIKVNGPRLESLVLTY 
INAISRGDLPCMENAVLALAQIENSAAVQKAIAHYDQQMGQKVQ
LPAETLQELLDLHRVSEREATEVYMKNSFKDVDHLFQKKLAAQL 
DKKRDDFCKQNQEASSDRCSALLQVIFSPLEEEVKAGIYSKPGG 
YCLFIQKLQDLEKKYYEEPRKGIQAEEILQTYLKSKESVTDAIL 
QTDQILTEKEKEIEVECVKAESAQASAKMVEEMQIKYQQMMEEK
EKSYQEHVKQLTEKMERERAQLLEEQEKTLTSKLQEQARVLKER 
CQGESTQLQNEIQKLQKTLKKKTKRYMSHKLKI 


6266 


276 


1421 


GSHQKQMLVPCFLYSLQNRKPSLYGSLTCQGIGLDGIPEVTASE 
GFTVNEINKKSIHISCPKENASSKFLAPYTTFSRIHTKSITCLD 
1 boKufelAj Vbbb LU\j 1MK1WQASNGELRRVLEGHVFDVNCCRFF 
PSGLWLSGGMDAQLKIWSAEDASCWTFKGHKGGILDTAIVDR 
GRN WSASRDGTARLVJDCGRS ACLG VLADCGS S INGVAVGAADN 
S INLGS PEQMP S EREVGTEAKMLLLAREDKKLQCLGLQS RQLVF 
LFIGSDAFNCCTFLSGFLLLAGTQDGNIYQLDVRSPRAPVQVIH 
RSGAPVLSLLSVRDGFIASQGDGSCFIVQQDLDYVTELTGADCD 
P VY KVATWEKQ I YTCCRDGLVRRYQLSDL 


6267 


3 


622 


LGMMKKNNSAKRGPQDGNQQPAPPEKVGWVRKFCGKGIFREIWK 
NRYWLKGDQLYISEKEVKDEKNIQEVFDLSDYEKCEELRKSKS 
RSKKNHSKFTliAHSKQPGNTAPNLIFLAVSPEEKESWINALNSA 
j. i xtrtJUNK XLiUtL V 1 VCiEL/o ILAnf I KJJKAKiyHbRKPPTRGHXiMA 
VASTSTSDGMLTLDLIQEEDPSPEEPTSLC 


6268 


160 


13*8 


HRELCQNLPAGLSSALIDNPLTLLLSIDTYVMLQEPVTFQDVAV 
DFSREEWGLLGPTQRTEYRDVMLETFGHLVSVGWETTLENKELA 
PNSDIPEEEPAPSLKVQESSRDCALSSTLEDTLQGGVQEVQDTV 
LKQMESAQEKDLPQKKHFDNRESQANSGALDTNQVSLQKIDNPE 
SQANSGALDTNQVLLHKIPPRKRLRKRDSQVKSMKHNSRVKIHQ
KSCERQKAKEGNGCRKTFSRSTKQITFIRIHKGSQVCRCSECGK 
IFKNPRYFSVHKKIHTGERPYVCQDCGKGFVQSSSLTQHQRVHS 
GERPFECOECGRTFNDRSAISOHT.RTHTRBK'PYIfPnnpriVa.PTPn 
SSHLIRHQRTHTGERPYACNKCGKAFTQSSHLIGHQRTHNRTKR
KKKQPTS 


6269 


2886 


1449 


HASAPTRRNMAAASPLRDCHAWKDARLPLSTTSNEACKLFDATL 
TQYVKWTNDKSLGGIEGCLSKLKAADPTFVMGHAMATGLVLIGT 
GSS VKLDKELDLAVKTMVEI SRTQ PL.TRREQLH VSAVETFANGN 
FPKACELWEQILQDHPTDMLALKFSHDAYFYLGYQEQMRDSVAR 
IYPFWTPDIPLSSYVKGIYSFGLMETNFYDQAEKLAKEALSINP
TDAWSVHTVAHIHEMKAEIKDGLEFMQHSETLWKDSDMLACHNY
WHWALYLIEKGEYEAALTIYDTHILPSLQANDAMLDWDSCSML
YRLQMEGVSVGQRWQDVLPVARKHSRDHILLFNDAHFLMASLGA 
HDPQTTQEXjLTTLRDASESPGENCQHLLARDVGLPLCQALVEAE 
DGNPDRVLELLLPIRYRIVQLGGSNAQRDVFNQLLIHAALNCTS 
SVHKNVARSLLMERDALKPNSPLTERLIRKAATVHLMQ 


6270 


23 


2086 


SVTVTLGSEGDGRPPTYHLEEMEQEPQNGEPAEIKIIREAYKKA
FLFVNKGLNTDELGQKEEAKNYYKQGIGHLLRGISISSKESEHT 
GPGWESARQMQQKMKETLQNVRTRLEILEKGLATSLQNDLQEVP 
KLYPEFPPKDMCEKLPEPQSFSSAPQHAEVNGNTSTPSAGAVAA 
PAS LS L P S OS CP AE A P ? AYT P OAAEGH Y^ VS YGTD <?fi P P c; q vfir 

EFYRNHSQPPPLETLGLDADELILIPNGVQIFFVNPAGEVSAPS 
YPGYLR I VR FLDNSLDT VLNR PPGFLQVCDWL YPLVPDRS PVLK 
CTAGAYMFPDTMLQAAGCFVGVVLSSELPEDDRELFEDLLRQMS 
DLRLQANWNRAEEENEFQIPGRTRPSSDQLKEASGTDVKQLDQG 
NKDVRHKGKRGKRAKDTSSEEVNLSHIVPCEPVPEEKPKELPEW 
SEKVAHNILSGASWVSWGLVKGAEITGKAIQKGASKLRERIQPE 
EKPVEVS PAVTKGLYIAKQATGGAAKVSQFLVDGVCTVANCVGK 
ELAPHVKKHG S KLVPE S LKKDKDGKSPLDGAMWAASS VQG FST 
VWQGLECAAKCIVNNVSAETVQTVRYKYGYNAGEATHHAVDSAV 
NVGVTAYNINNIGIKAMVKKTATQTGHTLLEDYQIVDNSQRENQ 
EGAANVNVRGEKDEQTKEVKEAKKKDK 


6271 


32 


1058 


GCGVKTAGMVGREKELSIHFVPGSCRLVEEEVNIPNRRVLVTGA
TGLLGRAVHKEFQQNNWHAVGCGFRRARPKFEQVNLLDSNAVHH
IIHDFQPHVIVHCAAERRPDWENQPDAASQLNVDASGNLAKEA
AAVGAFLIYISSDYVFDGTNPPYREEDIPAPLNLYGKTKLDGEK
AVLENNLGAAVLRIPILYGEVEKLEESAVTVMFDJCVQFSNKSAN
MDHWQQRFPTHVKDVATVCRQLAEKRMLDPSIKGTFHWSGNEQM
TKYEMACAIADAFNLPSSHLRPITDSPVLGAQRPRJNAQLDCSKL
ETLGIGQRTPFRIGIKESLWPFLIDKRWRQTVFH 


6272 


1136 


528 


GAVME DAAAPGRTEGVLERQGAP PAAGQGGALVELTPT PGGLAL 
VSPYHTHRAGDPLDLVALAEQVQKADEFIRANATNKLTVIAEQI 
QHLQEQARKVLEDAHRDANLHHVACNIVKKPGNIYYLYKRESGQ 
QYFS I ISPKEWGTSCPHDFLGAYKLQHDLSWTPYEDIEKQDAKI 
SMMDTLLSQSVALPPCTEPNFQGLTH 


6273 


256 


843 


SCPRVSPECRSLGCQVMFSLPLNCSPDHIRRGSCWGRPQDLKIA
SAAWNSKCHPGAGAAMARQHARTLWYDRPRYVFMEFCVEDSTDV 
HVL I EDHR I VFS CKNADG VEL YNE I E F YAKVNS KDSQDKRS S RS 
ITCFVRKWKEKVAWPRLTKEDIKPVWLSVDFDNWRDWEGDEEME 
LAHVEHYAEVRDNTYCVLPT 


6274 


56 


1142 


AAAAMAAAAGGGAGAARSLSRFRGCLAGALLGDCVGSFYEAHDT 
VDLTSVLRHVQSLEPDPGTPGSERTEALY*YTDDTAMARALVQSL 
LAKEAFDEVDMAHRFAQEYKKDPDRGYGAGWTVFKKLLNPKCR 
DVFE PARAQFNGKGS YGNGGAMRVAG I SLAYS SVQDVQKFARLS 
AQLTHAS SLG YNGAI LQALAVHLALQGES SSKHFLKQLLGHMED 
LEGDAQSVLDARELGMEERPYSSRLKKIGEIoLDQASVTREEWS 
ELGNGIAAFESVPTAIYCFLRCMEPDPEIPSAFNSLQRTLIYSI 
SLGGDTDTIATMAGAIAGAYYGMDQVPESWQQSCEGYEETDILA 
QSLHRVFQKS 


6275 


20 


565 


SRRGRARCLARGSRRPVPkPAkTMAFMVKTMVGGQLKNLTGSLG 
GGEDKGDGDKSAAEAQGMSREEYEEYQKQLVEEKMERDAQFTQR 
KAERATLRSHFRDKYRLPKNETDESQIQMAGGDVELPRELAKMI 

EEDTEEEEEKASVLGQLASLPGLNLGSLKDKAQATLGDLKQSAE 
KCHVM 


6276 


797 


97 


TLLPLPPLPDTBGMILLNTGLEGTVAENPVPIVHTPSGNILTLE
SCLQQLATHPGHWGIHLQIAEPAALRPSLALLARLSSLGLLHWP 
VW VGAKI SHGS F S V P GH VAGR E L LTAVAE VF PHVT VAPG W P E E V 
LGSGYREQLLTDMLELCQGLWQPVSFQMQAMLLGHSTAGAIGRL 
LASS PRATVTVEHNPAGGDYAS VRTALLAARAVDRTRVYYRLPQ 
GYHKDLLAHVGRN 


6277 


4600 


2744 


MAFRTEMGLYYSYFKTIVEAPSFLNGVWMIMNDKLTEYPLVINT 
LKRFNLYPEVILASWYRIYTKIMDLIGIQTKICWTVTIGEGr^p 
TESCEGLGDPACFYVAVIFILNGLMMALFFIYGTYLSGSRLGGL 
VTVLCFFFNHGBCTRVMWTPPLRESFSYPFLVLQMLLVTHILRA 
TKLYRGSLIALCISNVFFMLPWQFAQFVLLTQIASLFAVYWGY 
IDICKLRKI I YIHMISLALCFVLMFGNSMLLTSYYASSLVI I WG 
ILAMKPHFLKINVSELSLWVIQGCFWLFGTVILKYIiTSKIFGIA 
NDAHIGNLLTSKFFSYKDFDTLLYTCAAEFDFMEKETPLRYTKT 
LLLPWLVGFVAIVRKIISDMWGVLAKQQTHVRKHQFDHGELVY 
HALQLLAYTALG I LIMRLKLFLTPHMCVMASLI CSRQLFGWLFC 
KVHPGA I VFA I LAAMS I QGSANLQTQWNI VGEFSNLPQEEL I EW 
IKYSTKPDAVFAGAMPTMASVKLSALRPIVNHPHYEDAGLRART 
KIVYSMYSRKAAEEVKRELIKLKVNYYILEESWCVRRSKPGCSM 
PE I WD VE D P ANAG KT PLCNLL VKDS KPH FTTVFQNS VYKVL EW 
KE 


6278


3 


823 


ILFRLVLLSLVYLLNSVATEERKPAEVLIVEGQQYAWGTVLLL 
IRIILEYCQGVDNIPSVTTDMLTRLSDLLKYFNSRSCQLVLGAG
ALQWGLKTITTKNLALSSRCLQLIVHYIPVIRAHFEARLPPKQ
YSMLRHFDHITKDYHDHIAEISAKLVAIMDSLFDKLLSKYEVKA
PVPSACFRNICKQMTKMHEAIFDLLPEEQTQMLFLRINASYKLH
LKKQLSHLNVINDGGPQNGLVTADVAFYTGNLQALKGLKDLDLN
MAEIWEQKR 


6279 


127 


1687 


GGAMASDGARKQFWKRSNSKLPGSIQHVYGAQHPPFDPLLHGTL
LRSTAKMPTTPVKAKRVSTFQEFESNTSDAWDAGEDDDELLAMA
AESLNSEWMETANRVLRNHSQRQGRPTLQEGPGLQQKPRPEAE
PPSPPSGDLRLVKSVSESHTSCPAESASDAAPLQRSQSLPHSAT
VTLGGTSDPSTLSSSALSEREASRLDKFKQLLAGPNTDLEELRR
LSWSGIPKPVRPMTWKLLSGYLPANVDRRPATLQRKQKSYFAFI
EHYYDSRNDEVHQDTYRQIHIDIPRMSPEALILQPKVTEIFERI
LFIWAIRHPASGYVQGINDLVTPFFWFICEYIEAEEVDTVDVS
GVPAEVLCNIEADTYWCMSKLLDGIQDNYTFAQPGIQMKVKMLE
ELVSRIDEQVHRHLDQHEVRYLQFAFRWMNNLLMREVPLRCTIR
LWDTYQSEPDGFSHFHLYVCAAFLVRWRKEILEEKDFQELLLFL
QNLPTAHWDDEDISLLLAEAYRLKFAFADAPNHYKK


6280 


857 


2515 


ECCDQKMGSRNSSSAGSGSGDPSEGLPRRGAGLRRSEEEEEEDE 
DVDLAQVLAYLLRRGQVRLVQGGGAANLQFIQALLDSEESNDRA 
WDGRLGDRYNPPVDATPDTRELEFNEIKTQVELATGQLGLRRAA 
QKHSFPRMLHQRERGLCHRGSFSLGEQSRVISHFLPNDLGFTDS 
YSQKAFCGIYSKDGQIFMSACQDQTIRLYDCRYGRFRKFKSIKA
RDVGWSVLDVAFTPDGNHFLYSSWSDYIHICNIYGEGDTHTALD 
LRPDERRFAVFSIAVSSDGREVLGGANDGCLYVFDREQNRRTLQ 
IESHEDDVNAVAFADISSQILFSGGDDAICKVWDRRTMREDDPK
PVGALAGHQDGITFIDSKGDARYLISNSKDQTIKLWDIRRFSSR 
EGMEASRQAATQQNWDYRWQQVPKKAWRKLKLPGDSSLMTYRGH 
GVLHTLIRCRFSPIHSTGQQFIYSGCSTGKVVVYDLLSGHIVKK
LTNHKACVRDVSWHPFEEKIVSSSWDGNLRLWQYRQAEYFQDDM 
PESEECASAPAPVPQSSTPFSSPQ 


6281 


857 


2515 


ECCDQKMGSRNSSSAGSGSGDPSEGLPRRGAGLRRSEEEEEEDE 
DVDLAQVLAYLLRRGQVRLVQGGGAANLQFIQALLDSEEENDRA 
WDGRLGDRYNPPVDATPDTRELEFNEIKTQVELATGQLGLRRAA
QKHSFPRMLHQRERGLCHRGSFSLGEQSRVISHFLPNDLGFTDS
YSQKAFCGIYSKDGQIFMSACQDQTIRLYDCRYGRFRKFKSIKA
RDVGWSVLDVAFTPDGNHFLYSSWSDYIHICNIYGEGDTHTALD
LRPDERRFAVFSIAVSSDGREVLGGANDGCLYVFDREQNRRTLQ
IESHEDDVNAVAFADISSQILFSGGDDAICKVWDRRTMREDDPK
PVGAIAGHQDGITFIDSKGDARYLISNSKDQTIKLWDIRRFSSR 
EGMEASRQAATQQNWDYRWQQVPKKAWRKLKLPGDSSLMTYRGH 
GVLHTLIRCRFSPIHSTGQQFIYSGCSTGKVVVYDLLSGHIVKK
LTNHKACVRDVSWHPFEEKIVSSSWDGNLRLWQYRQAEYFQDDM 
PESEECASAPAPVPQSSTPFSSPQ 


6282 


125 


906 


RMAACRALKAVLVDLSGTLHIEDAAVPGAQEALKRLRGASVIIR
FVTNTTKESKQDLLERLRKLEFDISEDE1FT«;t,TAAP<;t,t xrovn 

VRPMLLVDDRALPDFKGIQTSDPNAWMGLAPEHFHYQILNQAF
RLLLDGAPLIAIHKARYYKRKDGLALGPGPFVTALEYATDTKAT 
WGKPEKTFFLEALRGTGCEPEEAVMIGDDCRDDVGGAQDVGML 
GILVKTGKYRASDEEKINPPPYLTCESFPHAVDHILQHLL


6283 


140 


1043 


LSLFGIHVMNPFWSMSTSSVRKRSEGEEKTLTGDVKTSPPRTAP 
KKQLPSIPKNALPITKPTSPAPAAQSTNGTHASYGPFYLEYSLL 
AEFTLWKQKLPGVYVQPSYRSALMWFGVIFIRHGLYQDGVFKF 
TVYIPDNYPDGDCPRLVFDIPVFHPLVDPTSGELDVKRAFAKWR 
RNHNHIWQVLMYARRVFYKIDTASPLNPEAAVLYEKDIQLFKSK
WDSVKVCTARLFDQPKIEDPYAISFSPWNPSVHDEAREKMLTQ
KKKPEEQHNKSVHVAGLSWVKPGSVQPFSKEEKTVAT 


6284 


1 


2879 


RSVIPGSTISSRWPGLSRPRFMAAHEWDWFQREELIGQISDIRV
QNLQVERENVQKRTFTRWINLHLEKCNPPLEVKDLFVDIQDGKI 
LMALLEVLSGRNLLHEYKSSSHRIFRLNNIAKALKFLEDSNVKL 
VSIDAAEIADGNPSLVLGLIWNIILFFQIKELTGNLSRNSPSSS
LAPGSGGTDSDSSFPPTPTAERSVAISVKDQRKAIKALLAWVQR
KTRKYGVAVQDFAGSWRSGLAFLAVIKAIDPSLVDMKQALENST
RENLEKAFSIAQDALHIPRLLEPEDIMVDTPDEQSIMTYVAQFL
ERFPELEAEDIFDSDKEVPIESTFVRIKETPSEQESKVFVLTEN
GERTYTVNHETSHPPPSKVFVCDKPESMKEFRLDGVSSHALSDS
STEFMHQIIDQVLQGGPGKTSDISEPSPESSILSSRKENGRSNS 

LPIKKTVHFFADTYTfnPPr'QVT^rT.ciT flTFf'CDDWR vt?ct r>r\rvtr\T 
urj.ivn.i Knrcftui i r\Urr jun jjv_ r JiVjoirKViu^oLiKQDGHV 

LAVEVAEEKEQKQESSKIPESSSDKVAGDIFLVEGTNNNSQSSS
CNGALESTARHDEESHSLSPPGENTVMADSFQIKVNLMTVEALE
EGDYFEAIPLKASKFNSDLIDFASTSQAFNKVPSPHETKPDEDA
EAFENHAEKLGKRSIKSAHKKKDSPEPQVKMDKHEPHQDSGEEA
EGCPSAPEETPVDKKPEVHEKAKRKSTRPHYEEEGEDDDLQGVG
EELSSSPPSSCVSLETLGSHSEEGLDFKPSPPLSKVSVIPHDLF

Y FPH YE VP LAAVLEAY VED P EDLKNE EMDLP P P Pn YM P n t ,n q t? 

EEADGSQSSSSSSVPGESLPSASDQVLYLSRGGVGTTPASEPAP
LAPHEDHQQRETKENDPMDSHQSQESPNLENIANPLEENVTKES
ISSKKKEKRKHVDHVESSLFVAPGSVQSSDDLEEDSSDYSIPSR
TSHSDSSIYLRRHTHRSSESDHFSLCSVEERSRSG


6285 


2157 


1331 


SCKTENLLEMWWFQQGLSFLPSALVIWTSAAFIFSYITAVTLHH
IDPALPYISDTGTVAPEKCLFGAMLNIAAVLCIATIYVRYKQVH
ALSPEENVIIKLNKAGLVLGILSCLGLSIVANFQKTTLFAAHVS
GAVLTFGMGSLYMFVQTILSYQMQPKIHGKQVFWIRLLLVIWCG
VSALSMLTCSSVLHSGNFGTDLEQKLHWNPEDKGYVLHMITTAA
EWSMSFSFFGFFLTYIRDFQKISLRVEANLHGLTLYDTAPCPIN 
NERTRLLSRDI 


6286 


1619 


276 i 


KAGASCCGSANPYVSVGKSCVLLAMAQLQTRFYTDNKKYAVDDV 
PFSIPAASEIADLSNIINKLLKDKNEFHKHVEFDFLIKGQFLRM 
PLDKHMEMENISSEEWEIEYVEKYTAPQPEQCMFHDDWISSIK 
GAEEWILTGSYDKTSRIWSLEGKSIMTIVGHTDWKDVAWVKKD 
SLSCLLLSASMDQTILLWEWNVERNKVKALHCCRGHAGSVDSIA 
VDGSGTKFCSGSWDKMLKIWSTVPTDEBDEMEESTNRPRKKQKT 
EQLGLTRTPIVTLSGHMEAVSSVLWSDAEEICSASWDHTIRVWD 
VESGSLKSTLTGNKVFNCISYSPLCKRLASGSTDRHIRLWDPRT 
KDGSLVSLSLTSHTGWVTSVKWSPTHEQQLISGSLDNIVKLWDT
RSCKAPLYDLAAHEDKVLSVDWTDTGLLLSGGADNKLYSYRYSP
TTSHVGA 


6287


278


1482 


MQFFFNFQIGLRSTSGKEKYSGDAGFLGDALQLFLQCLALDEDF
APAKLQVQKILCDLLLPENLKEGLKESSWSSLPCTKNRPFDFHS
VMEESQSLNEPSPKQSEEIPEVTSEPVKGSLNRAQSAQSINSTE
MPAREDCLKRVSSEPVLSVQEKGVLLKRKLSLLEQDVIVNEDGR
NKLKKQGETPNEVCMFSLAYGDIPEELIDVSDFECSLCMRLFFE
PVTTPCGHSFCKNCLERCLDHAPYCPLCKESLKEYLADRRYCVT
QLLEELIVKYLPDELSERKKIYDEETAELSHLTKNVPIFVCTMA
YPTVPCPLHVFEPRYRLMIRRSIQTGTKQFGMCVSDTQNSFADY
GCMLQIRNVHFLPDGRSWDTVGGKRFRVLKRGMKDGYCTADIE
YLEDV


6288 


1 


743 


VTLYPCRGLVGNLLLGASGMASGCKIGPSILNSDLANLGAECLR 
MLDSGADYLHLDVMDGHFVPNITFGHPVVESLRKQLGQDPFFDM 
HMMVSKPEQWVKPMAVAGANQYTFHLEATENPGALIKDIRENGM
KVGLAIKPGTSVEYLAPWANQIDMALVMTVEPGFGGQKFMEDMM
PKVHWLRTQFPSLDIEVDGGVGPDTVKKCAEAGANMIVSGSAIM
RSEDPRSVINLLRNVCSEAAQKRSLDR 


6289 


1 


743 


VTLYPCRGLVGNLLLGASGMASGCKIGPSILNSDLANLGAECLR
MLDSGADYLHLDVMDGHFVPNITFGHPVVESLRKQLGQDPFFDM
HMMVSKPEQWVKPMAVAGANQYTFHLEATENPGALIKDIRENGM 
KVGLAIKPGTSVEYLAPWANQIDMALVMTVEPGFGGQKFMEDMM
PKVHWLRTQFPSLDIEVDGGVGPDTVHKCAEAGANMIVSGSAIM
RSEDPRSVINLLRNVCSEAAQKRSLDR 


6290 


3 


1856 


TLGRWLLGVYETVAPTLACLPRPRLRRRRRRRRRRMISRYTRKA
VPQSLELKGITKHALNHHPPPEKLEEISPTSDSHEKDTSSQSKS
DITRESSFTSADTGNSLSAFPSYTGAGISTEGSSDFSWGYGELD

QNATE KVQTM FTAI DELL YEQKLS VHTKS LOE E COO WT A <? P PHT 

RILGRQIITPSEGYRLYPRSPSAVSASYETTLSQERDSTIFGIR
GKKLHFSSSYAHKASSIAKSSSFCSMERDEEDSIIVSEGIIEEY
LAFDHIDIEEGFHGKKSEAATEKQKLGYPPIAPFYCWKEDVLAY
VFDSVWCKWSCMEQLTRSHWEGFASDDESNVAVTRPDSESSCV
LSELHPLVLPRVPQSKVLYITSNPMSLCQASRHQPNVNDLLVHG
MPLQPRNLSLMDKLLDLDDKLLMRPGSSTILSTRNWPNRAVEFS
TSSLSYTVQSTRRRNPPPRTLHPISTSHSCAETPRSVEEILRGA
RVPVAPDSLSSPSPTPLSRNNLLPPIGTAEVEHVSTVGPQRQMK
PHGDSSRAQSAVVDEPNYQQPQERLLLPDFFPRPNTTQSFLLDT
QYRRSCAVEYPHQARPGRGSAGPQLHGSTKSQSGGRPVSRTRQG
P


6291 


1732 


602 


LVAKMASSASARTPAGKRVINQEELRRLMKEKQRLSTSRKRIES
PFAKYNRLGQLSCALCNTPVKSELLWQTHVLGKQHREKVAELKG 
AKEASQGSSASSAPQSVKRKAPDADDQDVKRAKATLVPQVQPST 
SAWTTNFDKIGKEFIRATPSKPSGLSLLPDYEDEEEEEEEEEGD 
GERKRGDASKPLSDAQGKEHSVSSSREVTSSVLPNDFFSTNPPK 
APIIPHSGSIEKAEIHEKWERRENTAEALPEGFFDDPEVDARV 
RKVDAPKDQMDKEWDEFQKAMRQVNTIS5AIVAEEDEEGRLDRQ 
IGEIDEQIECYRRVEKLRNRQDEIKNKLXEILTIKELQKKEEEN 
ADSDDEGELQDLLSQDWRVKGALL 


6292 


1835 


1142 


TCPGAMKMVAPWTRFYSNSCCLCCHVRTGTILLGVWYLIINAW
LLILLSALADPDQYNFSSSELGGDFEFMDDANMCIAIAISLLMI
LICAMATYGAYKQRAAWIIPFFCYQIFDFALNMLVAITVLIYPN
SIQEYIRQLPPNFPYRDDVMSVNPTCLVLIILLFISIILTFKGY
LISCVWNCYRYINGRNSSDVLVYVTSNDTTVLLPPYDDATVNGA
AKEPPPPYVSA 


6293 


2^82 


1035 


FWCTLGTVDVHPIGWCAINSKILVPPRTIHAKFTDWKGYLMKRL 
VGSRTLPVDFHIKMVESMKYPFRQGMRLEWDKSQVSRTRMAW 
DTVIGGRLRLLYEDGDSDDDFWCHMWSPLIHPVGWSRRVGHGIK 
MSERRSDMAHHPTFRKIYCDAVPYLFKKVRAVYTEGGWFEEGMK
LEAIDPLNLGNICVATVCKVLLDGYLMICVDGGPSTDGLDWFCY
HASSHAIFPATFCQKNDIELTPPKGYEAQTFNWENYLEKTKSKA 
APSRLFNMDCPNHGFKVGMKLEAVDLMEPRLICVATVKRWHRL 
LSIHFDGWDSEYDQWVDCESPDIYPVGWCELTGYQLQPPVAAEP 
ATPLKAKEATKKKKKQFGKKRKRIPPTKTRPLRQGSKKPLLEDD 
PQGARKISSEPVPGEIIAVRVKEEHLDVASPDKASSPELPVSVE 
NIKQETDD 


6294 


354 


1814 


AQLTTRGRTVAGGVRWIPSPFPDLELYSCCLGTDRGFPELSHHC
KNVIATASDYDMAEITNIRPSFDVSPWAGLIGASVLWCVSVT
VFVWSCCHQQAEKKHKNPPYKFIHMLKGISIYPETLSNKKKIIK
VRRDKDGPGREGGRRNLLVDAAEAGLLSRDKDPRGPSSGSCIDQ
LPIKMDYGEELRSPITSLTPGESKTTSPSSPEEDVMLGSLTFSV
DYNFPKKALWTIQEAHGLPVMDDQTQGSDPYIKMTILPDKRHR
VKTRVLRKTLDPVFDETFTFYGIPYSQLQDLVLHFLVLSFDRFS
RDDVIGEVMVPLAGVDPSTGKVQLTRDIIKRNIQKCISRGELQV
SLSYQPVAQRMTWVLKARHLQKMDIAGLSGNPYVKVNVYYGRK
RIAKKKTHVKKCTLNPIFNESFIYDIPTDLLPDISIEFLVIDFD
RTTKNEWGRLILGAHSVTASGAEHWREVCESPRKPVAKWHSLS
EY


6295 


2795 


617 


VS SALLTGATS GS DAAKS EGAS AS PLSCTNAVAMDRPDEG P PAK 
TRRLSSSESPQRDPPPPPPPPPLLRLPLPPPQQRPRLQEETEAA 
QVLADMRGVGbG PALPP PP PYVI LEEGGI RAYFTLGAECPGWDS 
TIESGYGEAPPPTESLEALPTPEASGGSLBIDFQWQSSSFGGE 
GALETCSAVGWAPQRLVDPKSKEEAIIIVBDEDEDERESMRSSR 
RRRRRRRRKQRKVKRESRERNAERMESILQALEDIQLDLEAVNI 
KAGKAFLRLKRKF IQMRRPFLERRDLI IQHI PGFWVKAFLNHPR 
ISILINRRDEDIFRYLTNLQVQDLRHISMGYKMKLYFQTNPYFT 
NMVIVKEFQRNRSGRLVSHSTPIRWHRGQEPQARRHGNQDASHS 
FFSWFSNHSLPEADRIAEI IKNDLWVNPLRYYLRERGSRI KRKK 
QEMKKRKTRGRCEWIMEDAPDYYAVEDIFSEISDIDETIHDIK 
I SD FME TTDY FETTDNE I TD I NE N I CDS ENPDHNEVPNNE TTDN 
NESADDHETTDNNESADDNNENPEDNNKNTDDNEENPNNNENTY 
GNNFFKGGFWGSHGNNQDSSDSDNEADEASDDEDNDGNEGDNEG 
S DDDGNE GDNEGS DDDDRD I E Y YE KV I EDFDKDQAD YED V I E 1 1 

SDESVEEEGIEEGIQQDEDIYEEGNYEEEGSEDVWEEGEDSDDS 
DLEDVLQVPNGWANPGKRGKTG 


6296


727 


1199 


RHCGCDAQGACDSLP PTGTSS PVTARNAI PEARCC VWLLDGTTV 
EAVRPARERLARKELRQKRMQQFSRDSAYSSNKDSTCLLTERDT 
LGTSLQFPSPFSGTISFGSFSDS&IFPLGSQCCLGFQQFSISGK 
KWALIHKRVRLSVFGARWGRIYFGK 


6297 


1 


922 


QRAAAASPSSCGPRGAEYGALMAMEGYWRFLALLGSALLVGFLS 
VI FALVWVLHYREGLGWDGSALEFNWHPVLMVTGFVFIQGIAI I 
VYRLPWTWKCSKLLMKSIHAGLNAVAAILAIISWAVFENHNVN 
NlANMYSLHSWVGLIAVICYLLQLLSGFSVFLLPWAPLSIiRAFL 
MPIHVYSGIVIFGTVIATALMGLTEKLIFSLRDPAYSTFPPEGV 
FVNTLGLIiILVFGALIFWlVTRPQWKRPKEPNSTILHPNGGTEQ 
GARGSMPAYSGNNMDKSDSELNNEVAARKRNIiALDEAGQRSTM 


6298 


3 


985 


SVPLRRLSLSGTLQGAGTTTKMAVARIiAAVAAWVPCRSWGWAAV 
PFGPHRGLSVLLARIPQRAPRWLPACRQKTSLSFLNRPDLPNLA 
YKKLKGKSPGIIFIPGYLSYMNGTKALAIEEFCKSLGHACIRFD 
YSGVGSSDGNSEESTLGKWRKDVLSIIDDLADGPQILVGSSLGG 
WLM LHAA I AR P E KWAL I G VATAADTL VT KFNQ LP VE LKKE VEM 
KGVWSMPSKYSEEGVYNVQYSFIKEAEHHCLLHSPIPVNCPIRL 
LHGMKDD I VPWHTSMQVADRVLSTD VDVI LRKHSDHRMRE KAD I 
QLLVYTI DDL I DKLS T I VN 
6299


512 


814 


ECDLEGIMPNVTISLSLPTNGSPLQDILVHPCVTSLDSAILTSS 
S I DAMDDS AFSG PYKFP FTPPLES FNLCFYTSQ VP VPP I LGFYQ 
MKEEEVQLRNNH 


6300 


121 


692 


AAPSCWSQRGVPAAGTPSSPRLLVSRAAAPSAGPWGAWRQGARA 
AQSPFSIPNSSSVPYGSQDSVHSSPEDGGGGRDRPVGGSPGGPR 
LVIGSLPAHLSPHMFGGFKCPVCSKFVSSDEMDLHLVMCLTKPR 
ITYNEDVLSKDAGECAICLEELQQGDTIARLPCLCIYHKGCIDE 
WFEVNRSCPEHPSD 


6301 


616 


284 


GKFVPVNWEPPQPLPFPKYLRCYRCLLETKELGCLLGSDICLTP 
AGSSCITLHKKNSSGSDVMVSDCRSKEQMSDCSNTRTSPVSGFW 
I F S Q YC FLD FCND PQNRG L YTP j 


6302 


490 


745 


IFGFLHLFHMEHSFLLVCALFAHVFFSSSCGSSVALHSDPCLLS
PVLLNCIiPGDLRPLDELYAQKLKYKAISEELDHALNDMTSL 


6303 


2 


1961 


YWNEYGGGLLWQSWQEKHPGQALSSEPWNFPDTKEEWEQHYSQL 
YWYYLEQFQYWEAQGWTFDASQSCDTDTYTSKTEADDKNDEKCM 
KVDLVS FLSS P IMGDNDSSGTSDKDHS EILDGI SNI KLNS EEVT 
QSQLDSCTSHDGHQQLSEVSSKRECPASGQSEPRNGGTNEESNS 
SGNTNTDPPAEDSQKSSGANTSKDRPHASGTDGDESEEDPPEHK 
PS KL KRS HELD I DENP AS D FDDSG S LLG FK YGSGQK YGG I PN F S 
HRQ VR YLEKNVKL KS K YLDMRRQ I KMKNKHI FFTKESEKP FFFCK 
SKILSKVEKFLTWVNKPMDEEASQESSSHDNGHDASTSCDSEEQ 
DMSVKKGDDLLETNNPEPEKCQSVSSAGELETENYERDSLbATV 
PDEQDCVTQEVPDSRQAETEAEVKKKKNKKKNKKVNGLPPEIAA 
VPELAKYWAQRYRLFSRFDDGIKLDREGWFSVTPEKIAEHIAGR 
VSQSFKCDWVDAFCGVGGNTIQFALTGMRVIAIDIDPVKIALA 
RNNAEVYGIADKIEFICGDFLLLASFLKADWFLSPPWGGPDYA 
TAETFDIRTMMSPDGFEIFRLSKKITNNIVYFLPRNADIDQVAS 
LAGPGGQVEIEQNFLNNKLKTITAYFGDLIRRPASET 


6304 


1 


1438 


HRARVDRSRESPGGDLRHPGRVRRDITLSGHPRLSTQHWLLRE 
DE VGD P GT KDLGHP QHGS P I Q ETQS E WTL VS PLPGS DMAAL P A 
WRATSGLTLWPHTAEGRDLLGAENRALTGGQQAEDPTLASGAYQ 
WPGSVEKLQGSVWCDAETLLSSSRTGGQAPPWLTDHDVQMLRLL 
AQGEWDKARVPAHGQVLQVGFSTEAALQDLSSPRLSQLCSQGL 
CGLIKRPGDLPEVLSFHVDRVLGLRRSLPAVARRFHSPLLPYRY 
TDGGAR P V I WWAP D VQHLS D P D E DQNS LALG WLQ YQALLAHS CN 
WPGQAPCPGIHHTEWARLAIiFDFLLQVHDRLDRYCCGFEPEPSD 
P C VEE RLR E KCRN P AE LRL VH I LVRS S D P S HL V Y I DNAGNLQH P 
EDKLNFRLLEGIDGFPESAVKVLASGCLQNMLLKSLQMDPVFWE 
SQGGAQGLKQVLQTLEQRGQVLLGHIQKHNLTLFRDEDP 


6305 


99 


420 


NMIWRGRSTYRPRPRRSVPPPELIGPMLEPGDEEPQQEEPPTES
RDPAPGQEREEDQGAAETQVPDLEADLQELSQSKTGDECGDGPD
VQGKILTKSEQFKMPEGR


6306 


1 


1874 


PTRPSKVKVPHTFLIHSYTRPTVCQACKKLLKGLFRQGLQCKDC 
KFNCHKRCATRVPNDCLGEALINGDVPMEEATDFSEADKSALMD 
E S EDS G V I PGS H S ENALHAS E E EEGEGG KAQS S LG Y I P LMR WQ 
S VRHTTR KS S TTLREG WWH YS NKDTLRKRHYWRLDC KC I TL FQ 
NNTTNR Y YKE IPLSEI LTVES AQNFSLVPPGTNPHCFE I VTANA 
TYFVGEMPGGTPGGPSGQGAEAARGWETAIRQALMPVILQDAPS 
APGHAPHRQASLSISVSNSQIQENVDIATVYQIFPDEVLGSGQF 
GWYGGKHRKTGRDVAVKVIDKLRFPTKQESQLRNEVAILQSLR 
HPGIVNLECMFETPEKVFWMEKLHGDMLEMILSSEKGRLPERL 
TKFIilTQlLVALRHLHFKNIVHCDLKPENVLLASADPFPQVKLC 
D FGFAR 1 1 GEKS FRRS WGTPAYLAPEVLLNQG YNRS LDMWS VG 
VIMYVSLSGTFPFNEDEDINDQIQNAAFMYPASPWSHISAGAID 
LINNLLQVKMRKRYSVDKSLSHPWLQEYQTWLDLRELEGKMGER 
YITHESDDARWEQFAAEHPLPGSGLPTDRDLGGACPPQDHDMQG 
LAERISVL


6307 


2136 


589 


CFLLPRGRDPEPPEAGAAAPCAPGAPDMSFRKWRQSKFRHVFG 
QPVKNDQCYEDIRVSRVTWDSTFCAVNPKFLAVIVEASGGGAFL 
VLP LSKTGRIDKAYP T VCGHTG P VLD I DWCPHNDEV I ASGS ED C 
T VM VWQ I P ENG LTS P LTE P VWLEGHTKR VG 1 1 AWH PTARNVLL 
SAG CDN WL I WNVGTAE E L YRLDS LH PDL I YNVS WNHNG S L FC S 
ACKDKS VR 1 1 DP RRGTLVAE RE KAH EGAR PMRA I FLADG KVFTT 
GFSRMSERQLALWDPENLEEPMALQELDSSNGALLPFYDPDTSV 
VYVCGKGDSSIRYFEITEEPPYIHFLNTFTSKEPQRGMGSMPICR 
GLEVS KCE I ARFYKLHERKCEP I VMTVPRKSDLFQDDLYPDTAG 
PEAAL.EAEEWVSGRDADPILISLREAYVPSKQRDLKISRRNVLS 
DS R PAMAPGS S H LG APAS TTTAADAT P SGS LARAGEAG KL E E VM 
QELRALRALVKEQGDRICRLEEQLGRMENGDA 


6308


2 


1118 


GRPTRPEKMLLSLVLHTYSMRYLLPSWLLGTAPTYVLAWGVWR
LLS A F L P AR F YQALDDRL Y CVY QS MVLF FFEN YTG VQ I LL YGDL 
PKNKENI I YLANHQSTVDW I VADILAI RQNALGHVRYVLKEGLK 
WLPLYGWYFAQHGGIYVKRSAKFNEKEMRNKLQSYVDAGTPMYL 
VIFPEGTRYNPEQTfCVLSASQAFAAQRGLAVLKHVLTPRIKATH 
VAFDCMKNYLDAIYDVTWYEGKDDGGQRRESPTMTEFLCKECP 
KIHIHIDRIDKKDVPEEQEHMRRWLHERFEIKDKMLIEFYESPD 
PERRKRFPGKSVNSKLSIKKTLPSMLILSGLTAGMLMTDAGRKL 
YVNTW I YGTLLGCLW VT I KA 


6309


220 


563 


LVAEVKEPCSLPMLSVDMENKENGSVGVKNSMENGRPPDPADWA 
VMDVVNYFRTVGFEEQASAFQEQEIDGKSLLLMTRNDVLTGLQL 
KLGPALKIYEYHVKPLQTKHLKNNSS 


6310 


36 


979 


GPRCWKFLILSSVNCETLRIGKAWPQSSGQERYWTPRTHSSASE 
AQRGSLAELNVAAAGLWADCDQPLYDCPMCGLICTNYHILQEHV 
DLHLEENS FQQGMDRVQCSGDLQLAHQLQQEEDRKRRS EESRQE 
I EE FQKLQRQ YGLDNSGG Y KQQQ LRNM E I E VNRG RMP P S E FHRR 
KADMM E S LALG FDDG KTKTS G 1 1 EALHR Y YQNAATDVRRVW L S S 
WDHFHSSLGDKGWGCGYRNFQMLLSSLLQNDAYNDCL.KGMLIP 
CIPKIQSMIEDAWKEGFDPQGASQLIIRLQGTKAWIGACEVYIL 
LTSLRV 


6311 


1 


675 


PVWWNSCEGPRIAAAARTGHGVGRRARLACLGEPRVKAAVMLTL
ASKLKRDDGLKGSRTAATASDSTRRVSVRDKLLVKEVAELEANL 
P CTCKVH F PDPNKLH CFQLT VT PD EG Y YQGG KFQ FETE VPDA YN 
MVP PKVKCLTKI WHPNI TETGE I CLSLLREHS IDGTGWAPTRTL 
KDWWGLNSLFTDLLNFDDPLNIEAABHHLRDKEDFRNKVDDYI 
KRYAR 


6312 


213 


1400 


GDELVKREAGMKMLPGVGVFGTGSSARVLVPLLRAEGFTVEALW 
GKTEEEAKQLAEEMNIAFYTSRTDDILLHQDVDLVCISIPPPLT 
RQISVKALGIGKNWCEKAATSVDAFRMVTASRYYPQLMSLVGN 
VLRFLPAFVRMKQLISEHYVGAVMICDARIYSGSLLSPSYGWIC 
DELMGGGGLHTMGTYIVDLLTHLTGRRAEKVHGLLKTFVRQNAA 
I RG I RHVTS DDFC FFQMLMGGG VCS TVTLNFNM PGAFVHE VM W 
GSAGRLVARGADLYGQKNSATQEELLLRDSLAVGAGLPEQGPQD 
VPLLYLKGMVYMVQALRQSFQGQGDRRTWDRTPVSMAASFEDGL 
YMQSWDAIKRSSRSGEWEAVBVLTEEPDTNQNLCEALQRNNL 


6313 


2 


2071 


QRSGAARLAFLPSPFSPACVHRSPLSFHGCWFYFVWFMPLGVL
FHRRRAHGCTLSCSSFVEQPTAMEAEETMECLQEFPEHHKMILD
RLNEQREQDRFTDITLIVDGHHFKAHKAVLAACSKFFYKFFQEF
TQEPLVEIEGVSKMAFRHLIEFTYTAKLMIQGEEEANDVWKAAE
FLQMLEAIKALEVRNKENSAPLEENTTGKNEAKKRKIAETSNVI
TESLPSAESEPVEIEVEIAEGTIEVEDEGIETLEEVASAKQSVK
YIQSTGSSDDSALALLADITSKYRQGDRKGQIKEDGCPSDPTSK
QVEGIEIVELQLSHVKDLFHCEKCNRSFKLFYHFKEHMKSHSTE
SFKCEICNKRYLRESAWKQHLNCYHLEEGGVSKKQRTGKKIHVC
QYCEKQFDHFGHFKEHLRKHTGEKPFECPNCHERFARNSTLKCH 
LTACQTGVGAKKGRKKLYECQVCNSVFNSWDQFKDHLVIHTGDK 
PNHCTLCDLWFMQGNELRRHLSDAHNISERLVTEEVLSVETRVQ 
TEPVTSMTIIEQVGKVHVLPLLQVQVDSAQVTVEQVHPDLLQDS 
QVHDSHMSELPEQVQVSYLEVGRIQTEEGTEVHVEELHVERVNQ
MPVEVQTELLEADLDHVTPEIMNQEERESSQADAAEAAREDHED 
AEDLETKPTVDSEAEKAENEDRTALPVLE


6314 


2 


2071 


QRSGAARLAFLPSPFSPACVHRSPLSFHGCWFYFVWFMPLGVL 
FHRRRAHGCTLSCSSFVEQPTAMEAEETMECLQEFPEHHKMILD 
RLNEQREQDRFTDITLIVDGHHFKAHKAVLAACSKFFYKFFQEF 
TQEPLVEIEGVSKMAFRHLIEFTYTAKLMIQGEEEANDVWKAAE
FLQMLEAIKALEVRNKENSAPLEENTTGKNEAKKRKIAETSNVI
TESLPSAESEPVEIEVEIAEGTIEVEDEGIETLEEVASAKQSVK 
YIQSTGSSDDSALALLADITSKYRQGDRKGQIKEDGCPSDPTSK 
QVEGIEIVELQLSHVKDLFHCEKCNRSFKLFYHFKEHMKSHSTE 
SFKCEICNKRYLRESAWKQHLNCYHLEEGGVSKKQRTGKKIHVC
QYCEKQFDHFGHFKEHLRKHTGEKPFECPNCHERFARNSTLKCH 
LTACQTGVGAKKGRKKLYECQVCNSVFNSWDQFKDHLVIHTGDK 
PNHCTLCDLWFMQGNELRRHLSDAHNISERLVTEEVLSVETRVQ 

QVHDSHMSELPEQVQVSYLEVGRIQTEEGTEVHVEELHVERVNQ
MPVEVQTELLEADLDHVTPEIMNQEERESSQADAAEAAREDHED 
AEDLETKPTVDSEAEKAENEDRTALPVLE 


6315 


1 


1015 


LGLAVNVVTTLVLISYCPTATEEAPYWTYLLCALGLFIYQSLDA
IDGKQARRTNSCSPLGELFDHGCDSLSTVFMAVGASIAARLGTY
rune c owf iwnr v c iwuiwyi x vabntiKt 1 oXWUV 1 fijiy XAJjVI 
VF VLS AFGGATM WD YTI P I LE I KLK I LPVLGFLGG VI FSCSNYF 
HVILHGGVGKNGSTIAGTSVLSPGLHIGLIIILAIMIYKKSATD 
VFEKHPCLYI LM FGCVFAKVSQKLWAHMTKSELYLQDTVFIiG P 
GLLFLDQYFNNFIDEYVVLWMAMVISSFDMVIYFSALCLQISRH 
LHLNI FKTACHQAPEQVQVLSSKSHQNNMD 


6316 


1503 


792 


VSAGAGTGf MfifiTTfiTftkVTFRAnPMPW ttwk-h tptc p'nv t nc 
MKESSPSGSKSQRYSGAYGASVSDEELKRRVAEELALEQAKKES 
EDQKRLKQAKELDRERAAANEQLTRAILRERICSEEERAKAKHL 
ARQLE EKDRVL KKQ D AFYKEQLARLEE RS S E FYR VTT EQ YQKAA 
EEVEAKFKRYESHPVCADLQAKILQCYRENTHQTLKCSALATQY 
MHCVNHAKQSMLE KGG 


6317 


102 


839 


P BIAQT S AVLARE KGHLPTMRH EAPMQMAS AQDAR YG Q KDS SDQM 
FD YM F KLL I IGNS S VG KTS FL F R YADD S FTS AFVS T VG I D FKVK 
TVFKNEKRTJCTjOTWr)TAfJOPPYPTTT'TawppAMnT?TT i/ivnTfM 

a v t r\A* iii i\_rs. a, ivi_n^ ±YiU i.t\\j\ldt^i t\l ±± 1J\1 IKurUyiVjr J.Jbrli U J. IN 

E ES FNAVQDWSTQ I KT YSWDNAQVILVGNKCDMEDERVI STERG 
QHLGEQliGFEFFETSAKDNINVKQTFERLVDIICDKMSESLETD 
PAI TAAKQNTRLKETP PPPQPNCAC 


6318 


1765 


733 


PWHPLRTLPLHHPHPRPPRAEGREGADSMSHLPGLEtRREAPPL 
LGPLLS P FPLPAGS WHRQMLRSS LRFP I TNSAGAPCKAAGRMNT 
LAP VRRDR VLAE L P Q CLRKEAALHGH KD FHPRVTCACQEHRTGT 
VGFKISKVIWGDLSVGKTCLINRFCKDTFDKNYKATIGVDFEM 
ERFEVLGIPFSLQLWDTAGQERFKCIASTYYRGAQAIIIVFNLN 
DVASLEHTKQWLADALKENDPSSVLLFLVGSKKDLSTPAQYALM 
E KDALQ VAQ EMKAE YWAVSS LTG ENVR E F FFRVAALT F E ANVLA 
ELEKSGARRIGDWRINSDDSNLYLTASKKKPTCCP 


6319 


88 


717 


AATMRLNQNTLLLGKKWLVPYTSEHVPSRYHEWMKSEELQRLT 
ASEPLTLEQEYAMQCSWQEDADKCTFIVLDAEKWQAQPGATEES 
CMVGDVNLFLTDLEDLTLGE I EVMIAEPSCRGKGLGTEAVLAML 
SYGVTTLGLTKFEAKIGQGNEPSIRMFQKLHFEQVATSSVFQEV 






WO 01/53312 



PCT/US00/34263 



TLRLTVSESEHQWLLEQTSHVEEKPYRDGSAEPC 


6320 


90 


1111 


RPRTGREKVAMAAVDSFYLLYREIARJSCNCTMEALALVGAWYTA
RKSITVICDFYSLIRLHFIPRLGSRADLIKQYGRWAWSGATDG
IGKAYAEELASRGLNIILISRNEEKLQWAKDIADTYKVETDII
VADFSSGREIYLPIREALKDKDVGILVNNVGVFYPYPQYFTQLS
EDKLWDIINWIAAASLMVHWLPGMVERKKGAIVTISSGSCCK
PTPQLAAFSASKAYLDHFSRALQYEYASKGIFVQSLIPFYVATS
MTAPSNFLHRCSWLVPSPKVYAHHAVSTLGISKRTTGYWSHSIQ
FLFAQYMPEWLWVWGANILNRSLRKEALSCTA 


6321 


1418


341 


HRKAALGALMAGRLLGKALAAVSLSLALASVTIRSSRCRGIQAF 
RNSFSSSWFHLNTNVMSGSNGSKENSHNKARTSPYPGSKVERSQ 
VPNEKVGWLVEWQDYKPVEYTAVSVLAGPRWADPQISESNFSPK 
FNEKDGHVERKSKNGLYEIENGRPRNPAGRTGLVGRGLLGRWGP
NHAADPIITRWKRDSSGNKIMHPVSGKHILQFVAIKRKDCGEWA 
IPGGMVDPGEKISATLKREFGERAT.N^T/lKTciaPtfRPTPPK'T UK 

LFSQDHLVIYKGYVDDPRNTDNAWMETEAVNYHDETGEIMDNLM 
LEAGDDAGKVKWVDINDKLKLYASHSQFIKLVAEKRDAHWSEDS 
EADCHAL


6322 


2047 


1083 


NQEILKNVESSRTVQPHFLEFLLSLGWSVDVGRHPGWTGHVSTS
WS I N CCDDGEGS OOE E V T S S ED T G A 9 T FNf^n Y Y VT . W ni T .TTT T 

AFWPSPVESLTDSLESNISDQDSDSNMDLMPGILKQPSLTLEL 
FPNHTDNLNSSQRLSPSSRMRKLPQGRPVPPLGPETRVSWWVE 
RYDDIENFPLSELMTEISTGVETTANSSTSLRSTTLEKEVPVIF 
IHPLNTGLFRIKIQGATGKFNMVIPLVDGMIVSRRALGFLVRQT 
VINICRRKRLESDSYSPPHVRRKQKITDIVNKYRNKQLEPEFYT 
SLFQEVGLKNCSS 


6323 


1 


656 


PASTTDGAQEARVPLDGAFWIPRPPAGSPKGCFACVSKPPALQA
PAAPAPEPSASPPMAPTLFPMESKSSKTDSVRAAGAPPACKHLA
EKKTMTNPTTVIEVYPDTTEVNDYYLWSIFNFVYLNFCCLGFIA
LAYSLKVRDKKLLNDLNGAVEDAKTDRLINITRSGLAASCIMLW
MALSVIATHRGLRSSASILVAEPHDWNTERPQVTFRERCPAL 


6324 


1 


2061 


EGAGMRRCPCRGSLNEAEAGALPAAARMGLEAPRGGRRRQPGQQ 
RPGPGAGAPAGRPEGGGPWARTEGSSLHSEPERAGLGPAPGTES 
PQAEFWTDGQTEPAAAGLGVETERPKQKTEPDRSSLRTHLEWSW 
SELGTTCLWTETGTDGLWTDPHRSDLQFQPEEASPWTQPGVHGP 
WTE LE THGS OTO P K R VKS W A DMT .WTHnw Q Q C T .rvru D cr a r* d c v t? 

PSADGSWKELYTDGSRTQQDIEGPWTEPYTDGSQKKQDTEAARK
QPGTGGFQIQQDTDGSWTQPSTDGSQTAPGTDCLLGEPEDGPLE 
EPEPGELLTHLYSHLKCSPLCPVPRLIITPETPEPEAQPVGPPS 
RVEGGSGGFSSASSFDESEDDWAGGGGASDPBDRSGSKPWKKL
KTVLKYSPFWSFRKHYPWVQLSGHAGNFQAGEDGRILKRFCQC
EQRSLEQLMKDPLRPFVPAYYGMVLODGOTFNOMFnT.T.AnPF^P 
SIMDCKMGSRTYLEEELVKARERPRPRKDMYEKMVAVDPGAPTP 
EEHAQGAVTKPRYMQWRETMSSTSTLGFRIEGIKKADGTCNTNF
KKTQALEQVTKVLEDFVDGDHVILQKYVACLEELREALEISPFF 
KTHEWGSSLLFVHDHTGLAKVWMIDFGKTVALPDHQTLSHRLP
WAEGNREDGYLWGLDNMICLLQGLAQS 


6325 


165 


944 


GLRDPFRRKRRLKPQVKMSNYVNDMWPGSPQEKDSPSTSRSGGS 
SRLSSRSRSRSFSRSSRSHSRVSSRFSSRSRRSKSRSRSRRRHQ 
RKYRRYSRSYSRSRSRSRSRRYRERRYGFTRRYYRSPSRYRSRS 
RSRSRSRGRSYCGRAYAIARGQRYYGFGRTVYPEEHSRWRDRSR
TRSRSRTPFRLSEKDRMELLEIAKTNAAKALGTTNIDLPASLRT 
VPSAKETSRGIGVSSNGAKPEVSILGLSEQNFQKANCQI


6326


238 


680 


GEPSPATQQKPSATGAGVLHQHFSSGHIYVLMGLLPPPWTISFT 
VQTTLQPPGGLPAAPVSGRMAFEPVGRDLARRMVPRAGKRTQTL 
GARRVAAQGARPLPEDRRPKSGERLHVTVAPCWEFVLPSVSLTA
QAWGGVGQEASSGVP 


6327 


1 


1337 


SLARLAPAGGSWMPTQQPAAPSrRAPKPSRSLSGSLCALFSDA 
DSGSGMKAELPPGPGAVGREMTKEEKLQLRKEKKQQKKKRKEEK 
GAEPETGSAVSAAQCQGPTRELPESGIQLGTPREKVPAGRSKAE
LRAERRAKQEAERALKQARKGEQGGPPPKASPSTAGETPSGVKR 
LPEYPQVDDLLLRRLVKKPERQQVPTRKDYGSKVSLFSHLPQYS 
RONSLTOFMSIPSSVIHPAMVT?TifiT/W^nRTA7T?fiQMi»'Dr , T»T t d 

ALQQVIQDYTTPPNEELSRDLVNKLKPYMSFLTQCRPLSASMHN 
AIKFLNKEITSVGSSKREEEAKSELRAAIDRYVQEKIVLAAQAI 
SRFAYQKISNGDVILVYGCSSLVSRILQEAWTEGRRFRWWDS
RPWLEGRHTLRSLVHAGVPASYLLIPAASYVLPEVSTEEKDSKV 
GGEKV 


6328 


1030 


276 


HASAEVTTAAARGLGAMEEEMHTDAKIRAENGTGSSPRGPGCSL
RHFACEQNLLSRPDGSASFLQGDTSVLAGVYGPAEVKVSKEIFN
KATLEVILRPKIGLPGVAEKSRERLIRNTCEAWLGTLHPRTSI
TWr*QWSDAGSLLACCLNAACMALVDAGVPMRALFCGVACALD 
SDGTLVLDPTSKQEKEARAVLTFALDSVERKLLMSSTKGLYSDT 
ELQQCIiAAAQAASQHVFRFYRESLQRRYSKS 


6329 


3 


2016 


SSEVAAGGGTRSAMAEGSGEWTVSATGAANGLNNGAGGTSATT
SNPLSRKLHKILETRLDNDKEMLEALKALSTFFVENSLRTRRNL
RGDIERKSLAINEEFVSIFKEVKEELESISEDVQAMSNCCQDMT
SRLQAAKEQTQDLIVKTTKLQSESQKLBIRAQVADAFLSKFQLT 
5DfcMbLJjKfc» I KhGP ITEDFFKALGRVKQIHNDVKVLLRTNQQTA 
GLEIMEQMALLQETAYERLYRWAQSECRTLTQESCDVSPVLTQA
MEALQDRPVLYKYTLDEFGTARRSTWRGFIDALTRGGPGGTPR
PIEMHSHDPLRYVGDMLAWLHQATASEKEHLEALLKHVTTQGVE 
ENIQEWGHITEGVCRPLKVRIEQVIVAEPGAVLLYKISNLLKF 
YHHTISGIVGNSATALLTTIEEMHLLSKKIFFNSLSLHASKLMD
KVELPPPDLGPSSALNQTLMLLREVLASHDSSWPLDARQADFV 

OVLSCVTiDPTJinMPTVClA^NITifJTAnMaT'PMVMCT VMMVTTT »T T? 
wugcvijyr ajuynv, i v C2t\ji\ UVj l /VL/rlrt 1 r rJ V IN £> Lj I rlr] 1\ JL i.U/Yur 

EFTDRRLEMLQFQIEAHLDTLINEQASYVLTRVGLSYIYNTVQQ 

HKPEQGSIA1WPNLDSVTLKAAMVQFDRYLSAPDNLLIPQLNFL 

LSATVKEQIVKQSTELVCRAYGEVYAAVMNPINEYKDPENILHR 

SPOOVOTLLS 


6330 


1151 


333 


FFYYTFYENKTFSRKMVAEKETLSLNKCPDKMPKRTKLIiAQQPL ' 

PVHQPHSLVSEGFTVKAMMKNSVVRGPPAAGAFKERPTKPTAFR 

KFYERGDFPIALEHDSKGNKIAWKVEIEKLDYHHYLPLFFDGLC 

EMTFPYEFFARQGIHDMLEHGGNKILPVLPQLIIPIKNALNLRN 

RQVICVTLKVLQHLWSAEMVGKALVPYYRQILPVLNIFKNMNV

NSGDGIDYSQQKRENIGDLIQETLEAFERYGGENAFINIKYWP

TYESCLLN 


6331 


3 


495 


QQGQRVRTRGRRAO^ATPLEGCVDLSYPRTHAALLKVAQMVTL 
LIAFICVRSSLWTNYSAYSYFEWTICDLIMILAFYLVHLFRFY 
RVLTCISWPLSELLHYLIGTLLLLIASIVAASKSYNQSGLVAGA 
IFGFMATFLCMASIWLSYKISCVTQSTDAAV 


6332


1 


878 


VTESNKFDLVSFIPLLRERIYSNNQYARQFIISWILVLESVPDI
NLLDYLPEILDGLFQILGDNGKEIRKMCEWLGEFLKEIKKNPS
SVKFAEMANILVIHCQTTDDLIQLTAMCWMREFIQLAGRVMLPY 
SSGILTAVLPCLAYDDRKKSIKEVANVCNQSLMKLVTPEDDELD
ELRPGQROAEPTPDDALPKQEGTASGEWTPSLHLTSCRGPREPD
VIGVALGPHLSNQDYFMYVTHTIVAATQRSGSSGSPPFCRQDTG 
KLSTMATHSQLVKTGTGLEPRQAVSSSH 


6333 


3 


1467 


TRTPSEAEAGGESPQSCVSAAHSDWTAGKPVSLLAPLIPPRSAG
QPLTFSPSGRQPLRSLLVGMCSGSGRRRSSLSPTMRPGTGAERG 
GLMMGHPGMHYAPMGMHPMGQRANMPPVPHGMMPQMMPPMGGPP 
MGQMPGMMSSVMPGMMMSHMSQASMQPALPPGVNSMDVAAGTAS
GAKSMWTEHKSPDGRTYYYNTETKQSTWEKPDDLKTPAEQLLSK 
CPWKEYKSDSGKPYYYNSQTKESRWAKPKELEDLEGYQNTIVAG 
SLITKSNLHAMIKAEESSKQEECTTTSTAPVPTTEIPTTMSTMA 
AAEAAAAWAAAAAAAAAAAAANANASTSASNTVSGTVPVVPBP
EVTSIVATWDNENTVTISTEEQAQLTSTPAIQDQSVEVSSNTG
EETSKQETVADFTPKKEEEESQPAKKTYTWNTKEEAKQAFKELL 

KEKRVPSNASWEQAMKMIINDPRYSALAKLSEKKQAFNAYKVQT 
EKK 


6334 


17


644 


GGNPSGRAAGFAAAAMPSSPIjRVAWCSSNQNRSMEAHNILSKR 
GFSVRSFGTGTHVKLPGPAPDKPNVYDFKTTYDQMYNDLLRKDK 
ELYTQNGILHMLDRNKRIKPRPERFQNCKDLFDLILTCEERVYD 
QWEDLNSREQETCQPVHWNVDIQDNHEEATLGAFLICELCX2C 
IQHTEDMENBIDELLQEFEEKSGRTFLHTVCFY 


6335 


82 


529 


AARARPGVLCCRLLGAALGDQSRVEMSYIPGQPVTAWQRVEIH
KLRQGENLILGFSIGGGIDQDPSQNPFSEDKTDKGIYVTRVSEG 
GPAEIAGLQIGDKIMQVNGWDMTMVTHDQARKRIjTKRSEEWRL 
LVTRQSLQKAVQQSMLS


6336 


1003 


438 


HEPASKGRAEVGNMRLSVAAAISHGRVFRRMGLGPESRIHLLRN 
LLTGLVRHERIEAPWARVDEMRGYAEKLIDYGKLGDTNERAMRM 
ADFWLTEKDLIPKLFQVLAPRYKDQTGGYTRMLQIPNRSLDRAK
MAVIEYKGNCLPPLPLPRRDSHLTLLNQLLQGLRQDLRQSQEAS 
NHSSHTAQTPGI 


6337 


76 


524 


EG I QM LS VQ PDT KP KG CAGCNRK I KDR YLLKALDKYWH E D CLKt 
ACCDCRLGEVGSTLYTKANLILCRRDYLRLFGVTGNCAACSXLI 
PAFEMVMRAKDNVYHLDCFACQLCNQRFCVGDKFFLKNNMILCQ
TDYEEGLMKEGYAPQVR 


6338 




1349 


APNSESGTQGPLPTPANLFWTRRANPDPTTSMSATDRMGPKAVP 
GLRLALLLLLGLGTPKSGVQGQEGLDFPEYDGVDRVINVNAKNY 
KNVFKKYEVLALLYHEPPEDDKASQRQFEMEELILELAAQVLED 
KGVGFGLVDSEKDAAVAKKLGLTEVDSMYVFKGDEVIEYDGEFS 
ADTIVEFLLDVLEDPVELIEGERELQAFENIEDEIKLIGYFKSK 
DSEHYKAFEDAAEEFHPYIPFFATFDSKGAKKLTLKLNEIDFYE
AFMEEPVTIPDKPNSEEEIVNFVEEHRRSTLRKLKPESMYETWE
DDMDG IH I VAFAE EADPDG FE FLETLKAVAQDNTENPDLS 1 1 W I 
DPDDFPLLVPYWEKTFDIDLSAPQIGWNVTDADRLWMEMDDEE 
DLPSAEELEDWLEDVLEGEINTEDDDDDDDD 


6339 


246 


1813 


NRCDRGGGGQAERQAGQGCRTQGAGPGFGFGHSFFSQGAMKAFH 
TFCWLLVFGSVSEAKFDDFEDEEDIVEYDDNDFAEFEDVMEDS 
VTESPQRVIITEDDEDETTVELEGQDENQEGDFEDADTQEGDTE 
SEPYDDEEFEGYEDKPDTSSSKNKDPITIVDVPAHLQNSWESYY
LEILMVTGLLAYIMNYIIGKNKNSRLAQAWFNTHRELLESNFTL
VGDDGTNKEATSTGKLNQENEHIYNLWCSGRVCCEGMLIQLRFL 
KRQDLLNVLARMMRPVSDQVQIKVTMNDEDMDTYVFAVGTRKAL
VRLQKEMQDLSEFCSDKPKSGAKYGLPDSLAILSEMGEVTDGMM 
DTKMVHFLTHYADKIESVHFSDQFSGPKIMQEEGQPLKLPDTKR 
TLLLTFNVPGSGNTYPKDMEALLPLMNMVIYSIDKAKKFRLNRE 
GKQKADKNRARVEENFLKLTHVQRQEAAQSRREEKKRAEKERIM 
NEEDPEKQRRLEEAALRREQKKLEKKQMKMKQIKVKAM 


6340


2 


583 


EACAHTLSCPAFARLGRARRRPWMSHRTSSTFRAERSFHSSSSS 
SSSSTSSSASRALPAQDPPMEKALSMFSDDFGSFMRPHSEPLAF 
PARPGGAGNIKTLGDAYEFAVDVRDFSPEDIIVTTSNNHIEVRA 
EKLAADG7VMNNFAHKCQLPEDVDPTSVTSALREDGSLTIRARR 
HPHTEHVQQTFRTEIKI


6341 


2


645


KMAVLSAPGLRGFRILGLRSSVGPAVQARGVHQSVATDGPSSTQ 
PALPKARAVAPKPSSRGEYWAKLDDLVNWARRSSLWPMTFGLA 
CCAVEMMHMAAPRYDMDRFGWFRASPRQSDVMIVAGTLTNKMA
PALRKVYDQMPEPRYWSMGSCANGGGYYHYSYSVVRGCDRIVP 
VDIYIPGCPPTAEALLYGILQLQRKIKRERRLQIWYRR


6342 


2 


1191 


DPRVRAMLATLARVAALRKTCLFSGRGGGRGLWTGRPQSDMNNI 
KPLEGVKILDLTRVLAGPFATMNLGDLGAEVIKVERPGAGDDTR
TWGPPFVGTESTYYLSVNRNKKSIAVNIKDPKGVKIIKELAAVC
DVFVENYVPGKLSAMGLGYEDIDEIAPHIIYCSITGYGQTGPIS
QRAGYDAVASAVSGLMHITGPEVACLSHIAANYLIGQKEAKRWG
TAHGSIVPYQAFKTKDGYIWGAGNNQQFATVCKILDLPEIilDN 
SKYKTNHLRVHNRKELIKILSERFEEELTSKWLYLFEGSGVPYG 
PINNMKNVFAEPQVLHNGLVMEMEHPTVGKISVPGPAVRYSKFK 
MSEARPPPLLGQHTTHILKEVLRYDDRAIGELLSAGWDQHETH 


6343 


2 


936 


GTAMVSDEDEL^LLVIWDANPIWWGKQALKESQFTLSKCIDAV 
MVLGNSHLFMNRSNKLAVIASHIQESRFLYPGKNGRLGDFFGDP
GNPPEFNPSGSKDGKYELLTSANEVIVEEIKDLMTKSDIKGQHT 
ETLLAGSLAKALCYIHRMNKEVKDNQEMKSRILVIKAAEDSAK3 
YMNFMNVIFAAQKQNILIDACVLDSDSGLLQQACDITGGLYLKV
PQMPSLLQYLLWVFLPDQDQRSQLILPPPVHVDYRAACFCHRNL 
IEIGYVCSVCLSIFCNFSPICTTCETAFKISLPPVLKAKKKKLK
VSA 


6344 


2508 


147 


TMPTATLGNLRGYGMASPGLAAPSLTPPQLATPNLQQFFPQATR
QSLLGPPPVGVPMNPSQFNLSGRNPQKQARTSSSTTPNRKDSSS 
QTMPVEDKSDPPEGSEEAAEPRMDTPEDQDLPPCPEDIAKEKRT 
PAPEPEPCEASELPAKRLRSSEEPTEKEPPGQLQVKAQPQARMT 
VPKQTQTPDLLPEALEAQVLPRFQPRVLQVQAQVQSQTQPRIPS 
TDTQVQPKLQKQAQTQTSPEHLVLQQKQVQPQLQQEAEPQKQVQ 
PQVQPQAHSQGPRQVQLQQEAEPLKQVQPQVQPQAHSQPPRQVQ 
LQLQKQVQTQTYPQVHTQAQPSVQPQEHPPAQVSVQPPEQTHEQ 
PHTQPQVSLLAPEQTPWVHVCGLEMPPDAVEAGGGMEKTLPEP 
VGTQVSMEEIQNESACGLDVGECENRAREMPGVWGAGGSLKVTI 
LQSSDSRAFSTVPLTPVPRPSDSVSSTPAATSTPSKQALQFFCY 
ICKASCSSQQEFQDHMSEPQHQQRLGEIQHMSQACLLSLLPVPR 
DVLETEDEEPPPRRWCNTCQLYYMGDLIQHRRTQDHKIAKQSLR 
PFCTVCNRYFKTPRKFVEHVKSQGHKDKAKELKSLEKEIAGQDE 
DHFITVDAVGCFEGDEEEEEDDEDEEEIEVEEELCKQVRSRDIS
REEWKGSETYSPNTAYGVDFLVPVMGYICRICHKFYHSNSGAQL 
SHCKSLGHFENLQKYKAAKNPSPTTRPVSRRCAINARNALTALF 
TSSGRPPSQPNTQDKTPSKVTARPSQPPLPRRSTRLKT 


6345


2 


3483 


PRVRTKLILLVNDKKRYERVGGGPKRLGRDVEMEEMIEQLQEKV 
HELEKQNDTLKNRLISAKQQLQTQGYRQTPYNNVQSRINTGRRK 
ANENAGLQECPRKGIKFQDADVAETPHPMFTKYGNSLLEEARGE 
IRNLENVIQSQRGQIEELEHLAEILKTQLRRKENEIELSLLQLR
EQQATDQRSNIRDNVEMIKLHKQLVEKSNALSAMEGKFIQLQEK 
QRTLKISHDALMANGDELNMQLKEQRLKCCSLEKQLHSMKFSER
RIEELQDRINDLEKERELLKENYDKLYDSAFSAAHEEQWKLKEQ
QLKVQIAQLETALKSDLTDKTEILDRLKTERDQNEKLVQENREL 
QLQYLEQKQQLDELKKRIKLYNQENDINADELSEALLLIKAQKE 
QKNGDLSFLVKVDSEINKDLERSMRELQATHAETVQELSKTRNM
LIMQHKINKDYQMEVEAVTRKMENLQQDYELKVEQYVHLLDIRA
ARIHKLEAQLKDIAYGTKQYKFKPEIMPDDSVDEFDETIHLERG 
ENLFEIHINKVTFSSEVLQASGDKEPVTFCTYAFYDFELQTTPV 
VRGLHPEYNFTSQYLVHVNDLFLQYIQKNTITLEVHQAYSTEYE 
TIAACQLKFHEILEKSGRIFCTASLIGTKGDIPNFGTVEYWFRL 
RVPMDQAIRLYRERAKALGYITSNFKGPEHMQSLSQQAPKTAQL 
SSTDSTDGNLNELHITIRCCNHLQSRASHIiQPHPYWYKFFDFA 
DHDTAIIPSSNDPQFDDHMYFPVPMNMDLDRYLKSESLSFYVFD
DSDTQENIYIGKVNVPLISLAHDRCISGIFELTDHQKHPAGTIH
VILKWKFAYLPPSGSITTEDLGNFIRSEEPEVVQRLPPASSVST 
LVLAPRPKPRQRLTPVDKKVSFVDIMPHQSDVSQEGSVDEVKEN
TEKMQQGKDDVSLLSEGQLAEQSLASSEDETEITEDLEPEVEED
MSASDSDDCIIPGPISKNIKQPSEKiRIEIIALSLNDSQVTMDD 
TIQRLFVECRFYSLPAEETPVSLPKPKSGQWVYYNYSNVIYVDK
ENNKAKRDILKAILQKQEMPNRSLRFTWSDPPEDEQDLECEDI 
GVAHVDLADMFQEGRDLIEQNIDVFDARADGEGIGKLRVTVEAL 
HALQSVYKQYRDDLEA 


6346 


2921 


533 


QDRRLLRLELQKTCQPTSTMSGSHTPACGPFSALTPSIWPQEIL 
AKYTQKEESAEQPEFYYDEFGFRVYKEEGDEPGSSLLANSPLME 
DAPQRLRWQAHLEFTHNHDVGDLTWDKIAVSLPRSEKLRSLVLA 
GIPHGMRPQLWMRLSGALQKKRNSELSYREIVKNSSNDETIAAK 
QIEKDLLRTMPSNACFASMGSIGVPRLRRVLRALAWLYPEIGYC
QGTGMVAACLLLFLEEEDAFWMMSAIIEDLLPASYFSTTLLGVQ 
TDQRVLRHLIVQYLPRLDKLLQEHDIELSLITLHWFLTAFASVV 

FNTLSDIPSQMEDAELLLGVAMRLAGSLTDVAVETQRRKHLAYL
IADQGQLLGAGTLTWLSQWRRRTQRRKSTITALLFGEDDLEAL
KAKNIKQTELVADLREAILRVARHFQCTDPKNCSWSRQLPGLL
PNTALTPPTPLVGLYSLWQELTPDYSMESHQRDHENYVACSRSH 
RRRAKALLDFERHDDDELGFRKNDIITIVSQKDEHCWVGELNGL 
RGWFPAKFVEVLDERSKEYSIAGDDSVTEGVTDLVRGTLCPALK
ALFEHGLKKPSLLGGACHPWLFIEEAAGREVERDFASVYSRLVL 
CKTFRLDEDGKVLTPEELLYRAVQSVWVTHDAVHAQMDVKLRSL 
ICVGLNEQVLHLWLEVLCSSLPTVEKWYQPWSFLRSPGWVQIKC
ELRVLCCFAFSLSQDWELPAKREAQQPLKEGVRDMLVKHHLFSW 
DVDG 


6347 


2921 


533 


QDRRLLRLELQKTCQPTSTMSGSHTPACGPFSALTPSIWPQEIL 
AKYTQKEESAEQPEFYYDEFGFRVYKEEGDEPGSSLLANSPLME 
DAPQRLRWQAHLEFTHNHDVGDLTWDKIAVSLPRSEKLRSLVLA 
GIPHGMRPQLWMRLSGALQKKRNSELSYREIVKNSSNDETIAAK 
QIEKDLLRTMPSNACFASMGSIGVPRLRRVLRALAWLYPEIGYC
QGTGMVAACLLLFLEEEDAFWMMSAIIEDLLPASYFSTTLLGVQ

TDQRVLRHLIVQYLPRLDKLLQEHDIELSLITLHWFLTAFASVV
DIKLLLRIWDTjFFYPfiQPvrr.FnT tt/"MT ut vcptjt TneDKrci\oT 

FNTLSDIPSQMEDAELLLGVAMRLAGSLTDVAVETQRRKHLAYL 
IADQGQLLGAGTLTNLSQWRRRTQRRKSTITALLFGEDDLEAL
KAKNIKQTELVADLREAILRVARHFQCTDPXNCSWSRQLPGLL
PNTALTPPTPLVGLYSLWQELTPDYSMESHQRDHENYVACSRSH 
RRRAKALLDFERHDDDELGFRKNDIITIVSQKDEHCWVGELNGL 
RGWFPAKFVEVLDERSKEYSIAGDDSVTEGVTDLVRGTLCPALK
ALFEHGLKKPSLLGGACHPWLFIEEAAGREVERDFASVYSRLVL 
CKTFRLDEDGKVLTPEELLYRAVQSVNVTHDAVHAQMDVKLRSL 
ICVGLNEQVLHLWLEVLCSSLPTVEKWYQPWSFLRSPGWVQIKC
ELRVLCCFAFSLSQDWELPAKREAQQPLKEGVRDMLVKHHLFSW
DVDG 


6348 


3 


3679 


AGAEKCFVTLLACFLAKQQNKYKYEECKDLIKSMLRNELQFKEE 
KLAEQLKQAEELRQYKVLVHSQERELTQLREKLREGRDASRSLN
EHLQALLTPDEPDKSQGQDLQEQLAEGCRLAQHLVQKLSPENDN 
DDDEDVQVEVAEKVQKSSSPREMQKAEEKEVPEDSLEECAITCS 
NSHGPCDSNQPHKNIKITFEEDEVNSTLWDRESSHDECQDALN 
ILPVPGPTSSATNVSMWSAGPLSGEKAAINILEINEKLRPQLA
EKKQQFRNLKEKCFLTQLACFLANQQNKYKYEECKDLIKFMLRN 
ERQFKEEKLAEQLKQAEELRQYKVLVHSQERELTQLREKLREGR
DASRSLNEHLQALLTPDEPDKSQGQDLQEQLAEGCRLAQHLVQK 
LSPENDNDDDEDVQVEVAEKVQKSSAPREMPKAEEKEVPEDSLE
ECAITCSNSHGPYDSNQPHRKTKITFEEDKVDSTLIGSSSHVEW 
EDAVHIIPENESDDEEEEEKGPVSPRNLQESEEEEVPQESWDEG 
YSTLSIPPEMLASYKSYSSTFHSLEEQQVCMAVDIGRHRWDQVK 
KEDHEATGPRLSRELLDEKGPEVLQDSLDRCYSTPSGCLELTDS
CQPYRSAFYVLEQQRVGLAVNMDEIEKYQEVEEDQDPSCPRLSR

ELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQPYSSAVYSLEE 
OYLGLALDVDRIKKDOEEEEDOCJPPPPRT.<;RTrT.T.B , \n/c , DT?\7T nn 

SLDRCYSTPSSCLEQPDSCQPYGSSFYALEEKHVGFSLDVGEIE 

KKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCPRLSRELLDEKG 

PEVLQDSLDRCYSTPSGCLELTDSCQPYRSAFYILEQQRVGLAV 

DMDEIEKYQEVEEDQDPSCPRLSGELLDEKEPEVLQESLDRCYS 

TPSGCLELTDSCQPYRSAFYILEQQRVGLAVDMDEIEKYQEVEE 

DQDPSCPRLSRELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQ 

PYSSAVYSLEEQYLGLALDVDRIKKDQEEEEDQGPPCPRLSREL

LEWEPEVLQDSLDRCYSTPSSCLEQPDSCQPYGSSFYALEEKH 

VGFSLDVGEIEKKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCP 

RLNSMLMEVEEPEVLQDSLDICYSTPSMYFELPDSFQHYRSVFY

SFEEEHISFALYVDNRFFTLTVTSLHLVFQMGVIFPQ


6349 


3 


3679 


AGAEKCFVTLLACFLAKQQNKYKYEECKDLIKSMLRNELQFKEE

KLAEQLKQAEELRQYKVLVHSQERELTQLREKLREGRDASRSLN 

EHLQALLTPDEPDKSQGQDLQEQLAEGCRLAQHLVQKLSPENDN 

DDDEDVQVEVAEKVQKSSSPREMQKAEEKEVPEDSLEECAITCS 

NSHGPCDSNQPHKNIKITFEEDEVNSTLWDRESSHDECQDALN 

ILPVPGPTSSATNVSMWSAGPLSGEKAAINILEINEKLRPQLA 

EKKQQFRNLKEKCFLTQLACFLANQQNKYKYEECKDLIKFMLRN

ERQFKEEKLAEQLKQAEELRQYKVLVHSQERELTQLREKLREGR 

DASRSLNEHLQALLTPDEPDKSQGQDLQEQLAEGCRLAQHLVQK

LSPENDNDDDEDVQVEVAEKVQKSSAPREMPKAEEKEVPEDSLE 

ECAITCSNSHGPYDSNQPHRKTKITFEEDKVDSTLIGSSSHVEW 

EDAVHIIPENESDDEEEEEKGPVSPRNLQESEEEEVPQESWDEG 

YSTLSIPPEMLASYKSYSSTFHSLEEQQVCMAVDIGRHRWDQVK

KEDHEATGPRLSRELLDEKGPEVLQDSLDRCYSTPSGCLELTDS 

CQPYRSAFYVLEQQRVGLAVNMDEIEKYQEVEEDQDPSCPRLSR

ELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQPYSSAVYSLEE 

v * j-u\uu v ut\ x r^ssjj^Etd ci tit uyvj f f r* Kijo KtiJjijLiVy EPEVLQD 

SLDRCYSTPSSCLEQPDSCQPYGSSFYALEEKHVGFSLDVGEIE

KKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCPRLSRELLDEKG 

PEVLQDSLDRCYSTPSGCLELTDSCQPYRSAFYILEQQRVGLAV 

DMDEIEKYQEVEEDQDPSCPRLSGELLDEKEPEVLQESLDRCYS 

TPSGCLELTDSCQPYRSAFYILEQQRVGLAVDMDEIEKYQEVEE

DQDPSCPRLSRELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQ 

PYSSAVYSLEEQYLGLALDVDRIKKDQEEEEDQGPPCPRLSREL 

LEWEPEVLQDSLDRCYSTPSSCLEQPDSCQPYGSSFYALEEKH 

VGFSLDVGEIEKKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCP 

RLNSMLMEVEEPEVLQDSLDICYSTPSMYFELPDSFQHYRSVFY

SFEEEHISFALYVDNRFFTLTVTSLHLVFQMGVIFPQ


6350 


3 


3679 


AGAEKCFVTLLACFLAKQQNKYKYEECKDLIKSMLRNELQFKEE
KLAEQLKQAEELRQYKVLVHSQERELTQLREKLREGRDASRSLN
EHLQALLTPDEPDKSQGQDLQEQLAEGCRLAQHLVQKLSPENDN
DDDEDVQVEVAEKVQKSSSPREMQKAEEKEVPEDSLEECAITCS 
NSHGPCDSNQPHKNIKITFEEDEVNSTLWDRESSHDECQDALN 
ILPVPGPTSSATNVSMWSAGPLSGEKAAINILEINEKLRPQLA 
EKKQQFRNLKEKCFLTQLACFLANQQNKYKYEECKDLIKFMLRN
ERQFKEEKLAEQLKQAEELRQYKVLVHSQERELTQLREKLREGR
DASRSLNEHLQALLTPDEPDKSQGQDLQEQLAEGCRLAQHLVQK
LSPENDNDDDEDVQVEVAEKVQKSSAPREMPKAEEKEVPEDSLE



ECAITCSNSHGPYDSNQPHRKTKITFEEDKVDSTLIGSSSHVEW

EDAVHIIPENESDDEEEEEKGPVSPRNLQESEEEEVPQESWDEG 

YSTLSIPPEMLASYKSYSSTFHSLEEQQVCMAVDIGRHRWDQVK 

KEDHEATGPRLSRELLDEKGPEVLQDSLDRCYSTPSGCLELTDS 

CQPYRSAFYVLEQQRVGLAVNMDEIEKYQEVEEDQDPSCPRLSR 

ELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQPYSSAVYSLEE 

QYLGIALDVDRIKKDOEEEEDOGPPr , PRT,«?RFr,T 1 T7\/VFDTr'i/T nn 

SLDRCYSTPSSCLEQPDSCQPYGSSFYALEEKHVGFSLDVGEIE

KKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCPRLSRELLDEKG 

PEVLQDSLDRCYSTPSGCLELTDSCQPYRSAFYILEQQRVGLAV 

DMDEIEKYQEVEEDQDPSCPRLSGELLDEKEPEVLQESLDRCYS 

TPSGCLELTDSCQPYRSAFYILEQQRVGLAVDMDEIEKYQEVEE 

DQDPSCPRLSRELLDEKEPEVLQDSLGRCYSTPSGYLELPDLGQ 

PYSSAVYSLEEQYLGLALDVDRIKKDQEEEEDQGPPCPRLSREL

LEWEPEVLQDSLDRCYSTPSSCLEQPDSCQPYGSSFYALEEKH

VGFSLDVGEIEKKGKGKKRRGRRSKKERRRGRKEGEEDQNPPCP 
RLNSMLMEVEEPEVLQDSLDICYSTPSMYFELPDSFQHYRSVFY 
SFEEEHISFALYVDNRFFTLTVTSLHLVFQMGVIFPQ


6351 


1291 


319 


REARRRTERSQLGRMLWEVANGRSLVWGAEAVQALRERLGVGG 
RTVGALPRG PRQNSRLGLPLLLMPEEARLIiAE IGAVTL VS APRP 
DSRHHSLALTSFKROOEE^FOFOqATAaPnppTPPnTTT t t7vttt7 
GQAAKKQKLEQASGASSSQEAGSSQAAKEDETSDGQASGEQEEA
GPSSSQAGPSNGVAPLPRSALLVQLATARPRPVKARPLDWRVQS 
KDWPHAGRPAHELRYSIYRDLWERGFFLSAAGKFGGDFLVYPGD 
PLRFHAHYIAQCWAPEDTIPLQDLVAAGRLGTSVRKTLLLCSPQ 
PDGKWYTSLQWASLQ


6352 


235 


923 


WSEWLSPCHAAKCKGLSMLRITMKTRAISLAADATEFVQGRSAP

AMARSLVHDTVFYCLSVYQVKISPTPQLGAASSAEGHVGQGAPG 

LMGNMNPEGGVNHENGMNRDGGMIPEGGGGNQEPRQQPQPPPEE 

PAQAAMEGPQPENMQPRTRRTKFTLLQVEELESVFRHTQYPDVP 

TRRELAENLGVTEDKVRVWFKNKRARCPJ^QRELMU^ELRADP 

DDCVYIWD 


6353 


65 


672 


RFAGAGAIPEARARPPDVQAAEEEKEMDLPDSASRVFCGRILSM 
VNTDDVNAIILAQKNMLDRFEKTNEMLLNFNNLSSARLQQMSER 
FLHHTRTLVEMKRDLDSIFRRIRTLKGKLARQHPEAFSHIPEAS 
FLEEEDEDPIPPSTTTTIATSEQSTGSCDTSPDTVSPSLSPGFE 
DLSHVQPGSPAINGRSQTDDEEMTGE 


6354 


965 


510 


PSLRPMEPTRDCPLFGGAFSAILPMGAIDVSDLRPVPDNQEVFC 
HPVTDQSLIVELLELQAHVRGEAAARYHFEDVGGVQGARAVHVE 
SVQPLSLENLALRGRCQEAWVLSGKQQIAKENQQVAKDVTLHQA
LLRLPQYQTDLLLTFNQPP


6355


158 


1662 


RGSSAAFRGSGLRGAMTRRVT.PHnMnpnT.T TPPPPTPDrrpoT r> — 
WDGKVSEIKKKIKSILPGRSCDLLQDTSHLPPEHSDWIVGGGV
LGLSVAYWLKKLESRRGAIRVLWERDHTYSQASTGLSVGGICQ 
QFSLPENIQLSLFSASFLRNINEYLAWDAPPLDLRFNPSGYLL 
LASEKDAAAMESNVKVQRQEGAKVSLMSPDQLRNKFPWINTEGV 
ALASYGMEDEGWFDPWCLLQGLRRKVQSLGVLFCQGEVTRFVSS
SQRMLTTDDKAWLKRIHEVHVKMDRSLEYQPVECAIVINAAGA
WSAQIAALAGVGEGPPGTLQGTKLPVEPRKRYVYVWHCPQGPGL 
ETPLVADTSGAYFRREGLGSNYLGGRSPTEQEEPDPANLEVDHD 
FFQDKVWPHLALRVPAFETLKVQSAWAGYYDYNTFDQNGWGPH
PLWNMYFATGFSGHGLQQAPGIGRAVAEI'TVLKGRFQTIDLSPF 
L FTRF YLGE KI QENN 1 1 


6356 


354 


633 


TGLTSSCLPLQVMMTKRTKDMGKFSSVTVSTIDEEEEEIEAREV 
ADSYAQNAKVIEKQLERKGMSKRRLQELAELEAKKAKMKGTLID 
NQFK


6357 


2 


915 


GLLRNMALLVRVLRNQTSISQWVPVCSRLIPVSPTQGQGDRALS
RTSQWPQMSQSQACGGSEQIPGIDIQLNRKYHTTRKLSTTKDSP 
QPVEEKVGAFTKIIBAMGFTGPLKYSKWKIKIAALRMYTSCVEK 
TDFEEFFLRCQMPDTFNSWFLITLLHVWMCLVRMKQEGRSGKYM 
CRIIVHFMWEDVQQRGRVMGVNPYILKKNMILMTNHFYAAILGY
DEGILSDDHGLAAALWRTFFNRKCEDPRHLELLVEYVRKQIQYL 
DSMNGEDLLLTGEVSWRPLVEKNPQSILKPHSPTYNDEGL


6358 


2009 


1040 


ASDALHSLSAPVLRLSSRSAARPATMTEQAISFAKDFLAGGIAA
AISKTAVAPIERVKLLLQVQHASKQIAADKQYKGIVDCIVRIPK
EQGVLSFWRGNLANVIRYFPTQALNFAFKDKYKQIFLGGVDKHT
QFWRYFAGNLASGGAAGATSLCFVYPLDFARTRLAADVGKSGTB 
REFRGLGDCLVKITKSDGIRGLYQGFSVSVQGIIIYRAAYFGVY
DTAKGMLPDPKNTHIWSWMIAQTVTAVAGWSYPFDTVRRRMM
MQSGRKGADIMYTGTVDCWRKIFRDEGGKAFFKGAWSNVLRGMG 
GAFVLVLYDELKKVI 


6359 


98 


1086 


VCRQEEEKMKEDCLPSSHVPISDSKSIQKSELLGLLKTYNCYHE 
GKSFQLRHREEEGTLIIEGLLNIAWGLRRPIRLQMQDDREQVHL 
PSTSWMPRRPSCPLKEPSPQNGNITAQGPSIQPVHKAESSTDSS 
GPLEEAEEAPQLMRTKSDASCMSQRRPKCRAPGEAQRIRRHRFS 
INGHFYNHKTSVFTPAYGSVTNVRVNSTMTTLQVLTLLLNKFRV
EDGPSEFALYIVHESGERTKLKDCEYPLISRILHGPCEKIARIF 
LMEADLGVEVPHEVAQYIKFEMPVLDSFVEKLKEEEEREIIKLT
MKFQALRLTMLQRLEQLVEAK 


6360 


1 


345 


GTRGAVPSTLEEWLPPRSCRVFWIHSGTTMSKVSFKITLTSDP 
RLPYKVLSVPESTPFTAVLKFAAEEFKVPAATSAIITNDGIGIN 
PAQTAGNVFLKHGS ELR 1 1 PRDRVGSC 


6361 


615 


158 


RPGLGQLQHCALAPQAGNRRCRFHGRLHALTRSTHRGKPMSIMQ 
FKDTLNTPLPDSSPVAVPLGAPIAVASTLSVEHNDGVETGIWAC
APGRWRRQITSQEFCHFIQGRCTFTPDDGETLHIQAGDAliMLPA 
NSTGIWDIQETVRKTYVLIL


6362 


350 


1576 


TTWDGSHSAALKLQQLPPTSSSSAVSEASFSYKENLIGALLAIF 
GHLWSIALNLQKYCHIRLAGSKDPRAYFKTKTWWLGLFLMLLG 
ELGVFASYAFAPLSLIVPLSAVSVIASAIIGIIFIKEKWKPKDF
LRRYVLSFVGCGLAWGTYLLVTFAPNSHEKMTGENVTRHLVSW
PFLLYMLVEIILFCLLLYFYKEKNANNIWILLLVALLGSMTW 
TVKAVAGMLiVLS I QGNLQLD YP I F YVMFVCMVATAVYQAAFLSQ 
ASQMYDSSLIASVGYILSTTIAITAGAIFYLDFIGEDVLHICMF 
ALGCLIAFLGVFLITRNRKKPIPFEPYISMDAMPGMQNMHDKGM 
TVQPELKASFSYGALENNDNISEIYAPATLPVMQEEHGSRSASG 
VPYRVLEHTKKE 


6363 


21


1201 


RRTRLGSSFPRRRDSSAMESYDVIANQPWTDNGSGVIKAGFAG
DQIPKYCFPNYVGRPKHVRVMAGALEGDIFIGPKAEEHRGLLSI 
RYPMEHGIVKDWNDMERIWQYVYSKDQLQTFSEEHPVLLTEAPL 
NPRKNRERAAEVFFETFNVPALFISMQAVLSLYATGRTTGWLD 
SGDGVTHAVPIYEGFAMPHSIMRIDIAGRDVSRFLRLYLRKEGY
DFHSSSEFEIVKAIKERACYLSINPQKDETLETEKAQYYLPDGS
TIEIGPSRFRAPELLFRPDLIGEESEGIHEVLVFAIQKSDMDLR 
RTLFSNIVLSGGSTLFKGFGDRLLSEVKKLAPKDVKIRISAPQE 
RLYSTWIGGSILASLDTFKKMWVSKKEYEEDGARSIHRKTF


6364


21


1201 


RRTRLGSSFPRRRDSSAMESYDVIANQPWTDNGSGVIKAGFAG
DQIPKYCFPNYVGRPKHVRVMAGALEGDIFIGPKAEEHRGLLSI 
RYPMEHGIVKDWNDMERIWQYVYSKDQLQTFSEEHPVLLTEAPL 
NPRKNRERAAEVFFETFNVPALFISMQAVLSLYATGRTTGWLD
SGDGVTHAVPIYEGFAMPHSIMRIDIAGRDVSRFLRLYLRKEGY 
DFHSSSEFEIVKAIKERACYLSINPQKDETLETEKAQYYLPDGS
TIEIGPSRFRAPELLFRPDLIGEESEGIHEVLVFAIQKSDMDLR
RTLFSNIVLSGGSTLFKGFGDRLLSEVKKLAPKDVKIRISAPQE 
RLYSTWIGGSILASLDTFKKMWVSKKEYEEDGARSIHRKTF


6365 


234 


1989 


KH KS RAS CAARAQAFG P S RE RE VHS R FRSGLRRL& ESN S GCCTM " 
ASMGTLAFDEYGRPFLIIKDQDRKSRLMGLEALKSHIMAAKAVA 
NTMRTSLGPNGLDKMMVDKDGDVTVTNDGATILSMMDVDHQIAK
LMVELSKSQDDEIGDGTTGVWLAGALLEEAEQLLDRGIHPIRI
ADGYEQAARVAIEHLDKISDSVLVDIKDTEPLIQTAKTTLGSKV
VNSCHRQMAEIAVNAVLTVADMERRDVDFELIKVEGKVGGRLED 
TKLIKGVIVDKDFSHPQMPKKVEDAKIAILTCPFEPPKPKTKHK 
LDVTSVEDYKALQKYEKEKFEEMIQQIKETGANLAICQWGFDDE
ANHLLLQNNLPAVRWVGGPEIELIAIATGGRIVPRFSELTAEKL 
G FAG LVQE I S FGTTKDKM L VI EQ C KNS RAVT I F I RGGN KM 1 1 E E 
AKRSLHDALCVIRNLIRDNRWYGGGAAEISCALAVSQEADKCP
TLEQYAMRAFADALEVIPMALSENSGMNPIQTMTEVRARQVKEM 
NPALGIDCLHKGTNDMKQQHVIETLIGKKQQISLATQMVRMILK
IDDIRKPGESEE 


6366 


257 


1898 


GNKEGAHSSTFWVLLSIFLGAVAMLCKEQGITVLGLNAVFDILV

IGKFNVLEIVQKVLHKDKSLENLGMLRNGGLLFRMTLLTSGGAG 

MLYVRWRIMGTGPPAFTEVDNPASFADSMLVRAVNYNYYYSLNA 

WLLLCPWWLCFDWSMGCIPLIKSISDWRVIALAALWFCLIGLIC 

QALCSEDGHKRRILTLGLGFLVIPFLPASNLFFRVGFWAERVL

YLPSVGYCVLLTFGFGALSKHTKKKKLIAAWLGILFINTLRCV 

LRSGEWRSEEQLFRSALSVCPLNAKVHYNIGKNLADKGNQTAAI 

RYYREAVRLNPKYVHAMNNLGNILKERNELQEAEELLSLAVQIQ

PDFAAAWMNLGIVQNSLKRFEAAEQSYRTAIKHRRKYPDCYYNL 

GRL YADLNRH VD ALNAWRNATVLKPEHSLAWNNM 1 1 LLDNTGNL 

AQAEAVGREALELIPNDHSLMFSLANVLGKSQKYKESEALFLKA 

IKANPNAASYHGNLAVLYHRWGHLDLAKKHYEISLQLDPTASGT

KENYGLLRRKLELMQKKAV 


6367 


287 


1934 


SIGFPVMLVLSILLYTCEMFQDSVAFEDVAVSFTQEEWALLDPS 
QKNLYRDVMQETFKNLTSVGKTWKVQNIEDEYKNPRRNLSLMRE 
KLCESKESHHCGESFNQIADDMLNRKTLPGITPCESSVCGEVGT
GHSSLNTHIRADTGHKSSEYQEYGENPYRNKECKKAFSYLDSFQ 
SHDKACTKEKPYDGKECTETFISHSCIQRHRVMHSGDGPYKCKF 
CGKAFYFLNLCLIHERIHTGVKPYKCKQCGKAFTRSTTLPVHER
THTGVNADECKECGNAFSFPSEIRRHKRSHTGEKPYECKQCGKV
FISFSSIQYHKMTHTGEKPYECKQCGKAFRCGSHLQKHGRTHTG
EKPYECRQCGKAFRCTSDLQRHEKTHTEDKPYGCKQCGKGFRCA 
SQLQIHERTHSGEKPHECKECGKVFKYFSSLRIHERTHTGEKPH
ECKQCGKAFRYFSSLHIHERTHTGDKPYECKVCGKAFTCSSSIR 
YHERTHTGEKPYECKHCGKAFISNYIRYHERTHTGEKPYQCKQC
GKAFIRASSCREHERTHTINR


6368 


1 


327 


RPVPAKLNPRSWPRTAGALPLRPPPLTMAVFHDEVEIEDFQYDE 
DSETYFYPCPCGDNFSITKEDLENGEDVATCPSCSLIIKVIYDK 
DQFVCGETVPAPSANKELVKC 


6369 


1 


1745 


AGCCRDTRFPTPRGPGSLCHNFCRSAACTVTRTIHGSPREDTGT 
PRSREMMFQDSVAFEDVAVSFTQEEWALLDPSQKNLYRDVMQET 
FKNLTSVGKTWKVQNIEDEYKNPRRNLSLMREKLCESKESHHCG 
ESFNQIADDMLNRKTLPGITPCESSVCGEVGTGHSSLNTHIRAD 
TGHKSSEYQEYGENPYRNKECKKAFSYLDSFQSHDKACTKEKPY 
DGKECTETFISHSCIQRHRVMHSGDGPYKCKFCGKAFYFLNLCL 
IHERIHTGVKPYKCKQCGKAFTRSTTLPVHERTHTGVNADECKE
CGNAFSFPSEIRRHKRSHTGEKPYECKQCGKVFISFSSIQYHKM 
THTGEKPYECKQCGKAFRCGSHLQKHGRTHTGEKPYECRQCGKA 
FRCTSDLQRHEKTHTEDKPYGCKQCGKGFRCASQLQIHERTHSG 
EKPHECKECGKVFKYFSSLRIHERTHTGEKPHECKQCGKAFRYF
SSLHIHERTHTGDKPYECKVCGKAFTCSSSIRYHERTHTGEKPY 
ECKHCGKAFISNYIRYHERTHTGEKPYQCKQCGKAFIRASSCRE
HERTHTINR 


6370 


1711 


329 


FVLSEQRLRTERTWPRSPGLGRGAAAAGARTAGAGLLRLLLGCG 
ALVGGLRPVTMTTPANAQNASKTWELSLYELHRTPQEAIMDGTE 
IAVSPRSLHSELMCPICLDMLKNTMTTKECLHRFCSDCIVTALR 
SGNKECPTCRKKLVSKRSLRPDPNFDALISKIYPSREEYEAHQD
RVLIRLSRLHNQQALSSSIEEGLRMQAMKRAQRVRRPIPGSDQT
TTMSGGEGEPGEGEGDGEDVSSDSAPDSAPGPAPKRPRGGGAGG 
SSVGTGGGGTGGVGGGAGSEDSGDRGGTLGGGTLGPPSPPGAPS 
PPEPGGEIELVFRPHPLLVEKGEYCQTRYVKTTGNATVDHLSKY 
LALRIALERRQQQEAGEPGGPGGGASDTGGPDGCGGEGGGAGGG 
DGPEEPALPSLEGVSEKQYTIYIAPGGGAFTTLNGSLTLELVNE 
KFWKVSRPLELCYAPTKDPK 


6371 


3 


288 


GVANMSTAMNFGTKSFQPRPPDKGSFPLDHLGECKSFKEKFMKC
LHNNNFENALCRKESKEYLECRMERKLMLQEPLEKLGFGDLTSG
KSEAKK 


6372 


2141 


625 


RVSAIASEGKAEERYKKLEDLLEKSFSLVKMPSLQPVVMCVMKH 
LPKVPEKKLKLVMADKELYRACAVEVRRQIWQDNQALFGDEVSP 
LLKQYILEKESALFSTELSVLHNFFSPSPKTRRQGEWQRLTRM 
VGKOTKLYDMVLQFLRTLFLRTRNVHYCTLRAELLMSLHDLDVG 
EICTVDPCHKFTWCLDACIRERFVDSKRARELQGFLDGVKKGQE 
QVLGDLSMILCDPFAINTLALSTVRHLQELVGQETLPRDSPDLL 
LLLRLLALGQGAWDMIDSQVFKEPKMEVELITRFLPMLMSFLVD
DYTFNVDQKLPAEEKAPVSYPNTLPESFTKFLQEQRMACEVGLY 
YVLHITKQRNKNALLRLLPGLVETFGDLAFGDIFLHLLTGNLAL 
LADEFALEDFCSSLFDGFFLTASPRKENVHRHALRLLIHLHPRV 
APSKLEALQKALEPTGQSGEAVKELYSQLGBKLEQLDHRKPSPA 
QAAETPALELPLPSVPAPAPL 


6373 


67 


711 


PSRAARASPARLPAMVSWIISRLWLIFGTLYPAYYSYKAVKSK
DI KE YVKWMM YW 1 1 FALFTTAETFTD 1 FLCWFP FYYELKI AFVA 
WLLSPYTKGSSLLYRKFVHPTLSSKEKEIDDCLVQAKDRSYDAL 
VHFGKRGLNVAATAAVMAASKGQGALSERLRSFSMQDLTTIRGD
GAPAPSGPPPPGSGRASGKHGQPKMSRSASESASSSGTA 


6374 


535 


2105 


HKLFCSYISTSEFPSSTRHHSCPTHTFCNYTSSTIFLSSTRDHS 
CPTHTFCNYTSSTIFLSSTRDHSCPTHTSCNYTSSTIFLSSTRD 
HSCPTHTSCNYTSSTIFLSSTRDHSCPTHTFCNYPRPIIRLSSC 
CPAELQTEGSNGKKEVLSGFQWLEDTVLFPEGGGQPDDRGTIN 
DISVLRVTRRGEQADHFTQTPLDPGSQVLVRVDWERRFDHMQQH
SGQHLITAVADHLFKLKTTSWELGRFRSAIELDTPSMTAEQVAA 
IEQSVNEKIRDRLPVNVRELSLDDPEVEQVSGRGLPDDHAGPIR
WNIEGVDSNMCCGTHVSNLSDLQVIKILGTEKGKKNRTNLIFL
SGNRVLKWMERSHGTEKALTALLKCGAEDHVEAVKKLQNSTKIL
QKNNLN LLRDLA VH I AHS LRNS PDWGG W I LHRKEGDS E FMN 1 1 
ANEIGSEETLLFLTVGDEKGGGLFLLAGPPASVETLGPRVAEVL 
EGKGAGKKGRFQGKATKMSRRMEAQALLQDYISTQSAKE 


6375 


1 


1535 


AIMAAATRPVRLPEAGCEGRERCWNPSRSRSHSGEGGIiAAWSRT 
CPGRPRRPGQQWRGPTMLVTAYLAFVGLLASCLGLELSRCRAK 
P PGRACSNP S FLRFQLDFYQVYFLALAADWLQAP YLYKL YQH Y Y 
FLEGQ I AI LYVCGLASTVLFGLVASSLVDWLGRJKNS CVL FSLTY 
SLCCLTKLSQDYFVLLVGRALGGLSTALLFSAFEAWYIHEHVER 
HDFPAEWI PATFARAAFWNHVLAWAGVAAEAVASWIGLGPVAP 
FVAAI PLLALAGALALRNWGENYDRQRAFSRTCAGGLRCLLSDR 
RVLLLGTIQALFESVIFIFVFLWTPVLDPHGAPLGIIFSSFKAA 
SLLGSSLYRIATSKRYHLQPMHLLSLAVLIWFSLFMLTFSTSP 
GQ E S P VE S F I AF LL I ELACGL Y FP S MS FLRR KVI PETEQAGVLN 
Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E= 
Glutamic Acid, F=Phenylalanine, G=Glycine, 
H=Histidine, I=Isoleucine, K=Lysine, 
L=Leucine, M=Methionine, N=Asparagine, 
P=Proline, Q=Glutamine, R=Arginine, 
S=Serine, T=Threonine, V=Valine, 
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop 
Codon, /=possible nucleotide deletion, 
\=possible nucleotide insertion) 








WFRVPIjHSLACIiGLLVUIDSDRKTGTRNMFSICSAVMVMALLAV 
VGLFTWRHDAELRVPSPTEEPYAPEL 


6376 


380 


1437 


ISSTDIDHYRFSFLVNSKMPSKESWSGRKTNRAAVHKSKQEGRQ 
QDLLIAALGMKLGSPKSSVTIWQPLKLFAYSQLTSLVRRATLKE 
NEQIPKYEKIHNFKVHTFRGPHWCEYCANFMWGL1AQGVKCADC 
GLNVHKQCSKMVPNDCKPDLKHVKKVYSCDLTTLVKAHTTKRPM 
WDMCI RE I ESRGLNSEGLYRVSGFSDLIEDVKMAFDRDGEKAD 
I S VNM Y ED I Nil TGAL KL YFRDLP I P L I T YDA YP K F I ES AK I MD 
PDEQLETLHEALKLLPPAHCETLRYLMAHLKRVTLHEKENLMNA 
ENLG I VFG PTLMRS P ELDAMAALND I RYQRLWELL I KNED I L F 


6377 


2311 


1845 


SRIRRRSSRRPREPPGPSRRRRRRRPDPRTMPSEKTFKQRRTFE 
QRVEDVRL I REQHPTKI PVI I ER YKGEKQLP VLDKTKFLVPDHV 

NMSELIKIIRRRLQLNANQAFFLLVNGHSMVSVSTPISEVYESE 
KDEDGFL YM VYAS QETFGMKLS V 


6378 


686 


191 


GAGPWEAFPDGIGRRSRRARLPQYKRPPGRVdGGDSGRRNMAVA 
DLALIPDVDIDSDGVFKYVLIRVHSAPRSGAPAAESKEIVRGYK 
WAE YHAD I YDKVSGDMQKQGCDCE CLGGGR I SHQSQDKKI HV YG 
Y S MA YGP AQHA I S TE K I KAK Y PD YE VTW ANDG Y 


6379 


35 


378 


ERAGSPSPSRAALRRCAPQRSQAPRWPDRAACRRSFQGSQGRAY 
LFNSWNVGCGPAEERVLLTGLHAVADIYCENCFCTTLGWKYEHA 
FESSQKYKEGKYIIELAHMIKDNGWD 


6380 


1414 


462 


PAVQGQRGAGPPTGRGSGNMARFALTWRHGETRFNKEKIIQGQ 
G VD E P LS ETG F KQAAAAG I FLNNVK FTHAFS S DLMRTKQTMHG I 
LERSKFCKDMTVKYDSRLRERKYGWEGKALSELRAMAKAAREE 
CPVFTPPGGETLDQVKMRGIDFFEFLCQLILKEADQKEQFSQGS 
PSNCLETSLAEIFPLGKNHSSKVNSDSGIPGLAASVLWSHGAY 
MRSLFDYFLTDLKCSLPATLSRSELMSVTPNTGMSLFI INFEEG 
RE VKP T VQ C I CMNLQDHLNG LTENS LGLNLP S KSNH FE P L KG VP 
LALFTSLLC 


6381 


1668 


218 


A WRAQGS RG FS GAGWR PRQAAAMN F S E V FKLS S LLCKFS PDG K 
YLASCVQYRIiWRDVNTLQILQLYTCLDQIQHIEWSADSLFILC 
AMY KRG LVQ VW S LEQ P E WHCKIDEG S AGLVAS C WS PDGRH I LNT 
TEFHLRITVWSLCTKSVSYIKYPKACLQGITFTRDGRYMALAER 
RDCKDYVS I FVCSD WQLLRHFDTDTQDLTGIE WAPNGCVLAVWD 
TCLEYKILLYSLDGRLLSTYSAYEWSLGIKSVAWSPSSQFLAVG 
S YDG KVR I LNHVTWKM I TE FGHPAA INDPKIWY KEAE KS PQ LG 
LGCLS F P PPRAGAG PL PSS ES KYE I AS VP VSLQTL KP VTDRANP 
KIG IGMbAFS PDS YFLATRNDN1 PNAVWVWD I QKLRLFAVLEQL 
SPVRAFQWDPQQPRLAICTGGSRLYLWSPAGCMSVQVPGEGDFA 
VLSLCWHLSGDSMALLSKDHFCLCFLETEAWGTACRQLGGHT 


6382 


2 


1062 


FEEDEDRNLCLIAYPLKGDHGIVDIVDNSDCEPKSKLLRWTTNK 

KHHVLETEKTPKDWVRQHRKEEKMKSHKLEEEFEWLKKSEVLYY 

TVEKKGNISSQLKHYNPWSMKCHQQQLQRMKENAKHRNQYKFIL 

LENLTSRYEVPCVLDLKMGTRQHGDDASEEKAANQIRKCQQSTS 

AV I GVR VCGMQ V YQ AGSGQLM FMNK YHGRKLS VQG F KEALFQFF 

HNGRYLRRELLGPVLKKLTELKAVLERQESYRFYSSSIiLVIYDG 

KERPEWLDSDAEDLEDLSEESADESAGAYAYKPIGASSVDVRM 

IDFAHTTCRLYGEDTWHEGQDAGYIFGLQSLIDIVTEISEESG 
E 


6383 


3159 


1061 


S PAPGR PS P HGSQ P AARAAAAP AM PS AKQRG S KGGHGAAS PS EK 
G AH PS AAR P LAAP T PAAP ACRS P S PGGAP AS FPGRAPRS LASQ P 
AARAAAAPAMPSAKQRGSKGGHGAASPSEKGAHPSGGADDVAKK 
PPPAPQQPPPPPAPHPQQHPQQHPQNQAHGKGGHRGGGGGGGKS 
S SSSS ASAAAAAAAASS SAS CSRRLGRALNFLFYLALVAAAAFS 
GWCVHHVLEEVQQVRRSHQDFSRQREELGQGLQGVEQKVQSLQA 
TFGTFESILRSSQHKQDLTEKAVKQGESEVSRISEVLQKLQNEI 
LKDLSDG IHWKDARERDFTSLENTVEERLTELTKS INDNI AI F 

TEVQKRSQKEINDMKAKVASLEESEGNKQDLKALKEAVKEIQTS 

AKSRE WDMEALRS TLQTMESDI YTEVREL VS LKQEQQAFKEAAD 

TERLALQALTEKLLRSEESVSRLPEEIRRLEEELRQLKSDSHGP 

KEDGGFRHSEAFEALQQKSQGLDSRLQHVEDGVLSMQVASARQT 

ESLESLLSKSQEHEQRLAALQGRLEGLGSSEADQDGLASTVRSL 

GETQLVLYGDVEELKRSVGELPSTVESLQKVQEQVHTLLSQDQA 

QAARLP PQD FLDR LS S LDNLKAS VSQVE ADLKMLRTAVDSLVAY 

SVKIETNENNLESAKGLLDDLRNDLDRLFVKVEKIHEKV 


6384 


738 


1904 


IWEVPVCLTHLtHLQQANQPLPPPSSSINEEDADEANRAIGEKR 
AAPDSGKKPKTPKTKQQKDPNEPQKPVSAYALFFRDTQAAIKGQ 
NPNATFGEVSQIVASMWDSLGEEQKQVYKRKTEAAKKEYLKALA 
AYRASLVSKAAAESAEAQTIRSVQQTLASTNLTSSLLLNTPLSQ 
HGTVSASPQTLQQSLPRSIAPKPLTMRLPMWQIVTSVTIAANMP 
SNIGAPLISSMGTTMVGSAPSTQVSPSVQTQQHQMQLQQQQQQQ 
QQQMQQMQQQQLQQHQMHQQIQQQMQQQHFQHHMCX2HLQQQQQH 
IjQQQINQQQLQQQLQQRLQLQQLQHMQHQSQPSPRQHSPVASQI 
TSPIPAIGSPQPASQQHQSQIQSQTQTQVLSQVSIF 


6385 


2 


1584 


LAQGPDAATTDELSSLGSDSEANGFAERRIDKFGFIVGSQGAEG 
ALEEVPLEVLRQRESKWLDMLNNWDKWMAKKHKKIRLRCQKGIP 
PSLRGRAWQYLSGGKVKLQQNPGKFDELDMSPGDPKWLDVIERD 
LHRQFPFHEMFVSRGGHGQQDLFRVLKAYTLYRPEEGYCQAQAP 
IAAVLLMHMPAEQAFWCLVOICEKYLPGYY c ?PK'T,FJXTnT,nr , PTT 

FSLLQKVSPVAHKHLSRQKIDPLLYMTEWFMCAFSRTLPWSSVL 
RVWDM FFCEGVKI I FRVGLVLLKHALGS PEKVKACQGQ YET I ER 
LRSLSPKIMQEAFLVQEWELPVTERQIEREHLIQLRRWQETRG 
ELQCRSPPRLHGAKAILDAEPGPRPALQPSPSIRLPLDAPLPGS 
KAKP KP PKQAQKEQRKQMKGRGQLE KP PAPNQAM WAAAGDACP 
PQHVPPKDSAPKDSAPQDliAPQVSAHHRSQESLTSQESEDTYL 


6386 


819 


195 


TVCGS FYLG IMQRASRLKRELHMLATEPPPGI TCWQDKDQMDDL 
RAQILGGANTPYEKGVFKLEVIIPERYPFEPPQIRFLTPIYHPN 
IDSAGRICLDVLKLPPKGAWRPSLNIATVLTSIQLLMSEPNPDD 
PLMADISSEFKYNKPAFLKNARQWTEKHARQKQKADEEEMIjDNL 
PEAGDSRVHNSTQKRKASQLVGIEKKFHPDV 


6387 


1 


662 


PG P THAS ADAWADAWAQ PNMAMHNKAAP PQ I P DTRRELAE L V KR 
KQELAETLANLERQIYAFEGSYLEDTQMYGNIIRGWDRYLTNQK 
NSNSKNDRRNRKFKEAERLFSKSSVTSAAAVSALAGVQDQLIEK 
REPGSGTESDTSPDFHNQENEPSQEDPEDLDGSVQGVKPQKAAS 
STSSGSHHSSHKKRKNKNRHSPSGMFDYDFEIDLKLNFCKPRADY 


6388 


1 


662 


PGPTHASADAWADAWAQPNMAMHNKAAPPQI PDTRRELAELVKR 
KQEIiAETLANLERQIYAFEGSYLEDTQMYGNIIRGWDRYLTNQK 
NS NS KNDRRNRKFKEAERLFS KS S VTS AAAVSALAG VQ DQL I E K 
REPGSGTESDTSPDFHNQENEPSQEDPEDLDGSVQGVKPQKAAS 
STSSGSHHSSHKKRKNKNRHSPSGMFDYDFEIDLKLNKKPRADY 


6389 


1074 


4$7' 


AEPGDRMAGHRLVLVLGDLH t PHRCNSLPAKFKKLLVPGKI QH I 
LCTGNLCTKESYDYLKTLAGDVHIVRGDFDENLNYPEQKVVTVG 
QFKIGLIHGHQVIPWGDMASIALLQRQFDVDILISGHTHKFEAF 
EHENKFYINPGSATGAYNALETNIIPSFVLMDIQASTWTYVYQ 
LIGDDVKVERIEYKKP 


6390 


158 


535 


GEERKEGRAPGKAFAPERNPAKMEKEETTRELLLPNWQGSGSHG 
LTIAQRDDGVFVQEVTQNSPAARTGWKEGDQIVGATIYFDNLQ 
S G E VTQLLN TMG HHT VG LKLHRKGDR F F P S LGQTWDP 


6391 


5386 


2897 


VRWNS KTE C YLS I QTQ ENFPANLNE LVNC I V I S SLVTTQRKLKA 
MSLLGSRNQLARAVLNPNPMDFCTKDLLTTTSERIIAYLRDFNE 
DQKKAIETAYAMVKHSPSVAKICLIHGPPGTGKSKTIVGLLYRL 
LTENQRKGHSDENSNAKIKQNRVLVCAPSNAAVDELMKKIILEF 
KE KCKDK KN PLGNCGD I NL VRLG PEKS I NS E VLK FS L DS Q VNHR 
MKKELPSHVQAMHKRKEFLDYQLDELSRQRALCRGGREIQRQEL 
DENIS KVSKERQELASKIKEVQGRPQKTQS 1 1 ILESHI ICCTLS 
TSGGLLLESAFRGQGGVPFSCVIVDEAGQSCEIETLTPLIHRCN 
KLILVGDPKQLPPTVISMKAQEYGYDQSMMARFCRLLEENVEHN 
MISRLPILQLTVQYRMHPDICLFPSNYVYNRNLKTNRQTEAIRC 
SSDWPFQPYLVFDVGDGSERRDNDSYINVQEIKLVMEIIKLIKD 
KR KDVS FRNIG 1 1 TH YKAQKTM IQKDLDKE FDRKG PAE VDT VDA 
FQGRQKDCVIVTCVRANSIQGSIGFLASLQRLNVTITRAKYSLF 
ILGHLRTLMENQHWNQLIQDAQKRGAIIKTCDKNYRHDAVKILK 
LKPVLQRSLTHPPTIAPEGSRPQGGLPSSKLDSGFAKTSVAASL 
YHTPSDS KE ITLTVTS KDPERP P VHDQLQDPRLLKRMG I EVKGG 
IFLWDPQPSSPQHPGATPPTGEPGFPWHQDLSHVQQPAAWAA 
LSSHKPPVRGEPPAASPEASTCQSKCDDPEEELCHRREARAFSE 
GEQEKCGSETHHTRRNSRWDKRTLEQEDSSSKKRKLL 


6392 


972 


186 


GRTGVDLASSMAHRLQIRLLTWDVKDTLLRLRHPLGEAYATKAR 
AHG LE VE P S AL EQG FRQAYRAQ S H S FPN YG LS HGLTS RQ WWLD V 
VLQTFHLAGVQDAQAVAPIAEQLYKDFSHPCTWQVLDGAEDTLR 
E CR TRGLRLA VI SNFDRRLEG I LGGLGLREHFDF VLTS EAAGW P 
KPDPRIFQEALRLAHMEPWAAHVGDNYLCDYQGPRAVGMHSFL 
WGPQALDPWRDSVPKEHILPSLAHLLPALDCLEGSTPGL 


6393 


2017 


730 


TGG S KMAAVAT CGS VAAS TGS AVATAS KSNVTS FQRRG PRAS VT 
NDSGPRLVS IAGTRPSVRNGQLLVSTGLPALDQLLGGGLAVGTV 
LLI EEDKYNI YS PLLFKYFLAEGI VNGHTLLVASAKEDPANILQ 
ELPAPLLDDKCKKEFDEDVYNHKTPESNIKMKIAWRYQLLPKME 
IGPVSSSRFGHYYDASKRMPQELIEASNWHGFFLPEK1SSTLKV 
E PCS LTPG YTKLLQF I QN 1 1 YEEG FDGSNPQKKQRNI LR IG I QN 
LGS PLWGDD I CCAENGGNSHSLTKFLY VLRGLLRTSLSAC I ITM 
PTHLIQNKAIIARVTTIiSDWVGLESFIGSERETNPLYKDYHGL 
IHIRQIPRLNNLICDESDVKDLAFKLKRKLFTIERLHLPPDLSD 
TVSRSSKMDLAESAKRLGPGCGMMAGGKKHLDF 


6394 


1418 


511 


GAAAGGEGARRRPAAMATVMAATAXfiRAVLfifiE^RWLLHDEVHA - 

VLKQLQDILKEASLRFTLPGSGTEGPAKQENFILGSCGTDQVKG 

VLTLQGDALSQADVNLKMPRNNQLLHFAFREDKQWKLQQIQDAR 

NHVSQAIYLLTSRDQSYQFKTG7U3VLKLMDAVMLQLTRARNRLT 

TPATLTLPEIAASGLTRMFAPALPSDLLVNVYINLNKLCLTVYQ 

LHALQPNSTKNFRPAGGAVLHSPGAMFEWGSQRLEVSHVHKVEC 

VIPWLNDALVYFTVSLQLCQQLKDKISVFSSYJtfSYRPF 


6395 


13 


658 


PSGRPTRPLCCAARRGAARHGGSVSGWPAGRTPTETSNPGSSVM 
ESVTFEDVAVEFIQEWALLDSARRSLCKYRMLDQCRTLASRGTP 
PCKPSCVSQLGQRAEPKATERGILRATGVAWESQLKPEELPSMQ 
DLLEEASSRDMQMGPGLFLRMQLVPSIEERETPLTREDRPALQE 
PPWSLGCTGLPCAAMQIQRWI PVPTLGHRNPWVARDSGE 


6396 


1 


1221 


ANILS S PS KRGQKGTLIG YS P EGTP L YNFMG DAFQH S S QS I PRF 
I KES L KQ I LEES DS RQ I F Y FL CLNLL F TFVE LF YGVLTNS LGL I 
SDGFHMLFDCSALVMGLFAALMSRWKATRIFSYGYGRIEILSGF 
INGLFLIVIAFFVFMESVARLIDPPELDTHMLTPVSVGGLIVNL 
IGICAFSHAHSHAHGASQGSCHSSDHSHSHHMHGHSDHGHGHSH 
GSAGGGMNANMRGVFLHVLADTLGSIGVIVSTVLIEQFGWFIAD 
PLCSLFIAILIFLSWPLIKDACQVLLLRLPPEYEKELHIALEK 
IQKIEGLISYRDPHFWRHSAS IVAGTIHIQVTSDVLEQRIVQQV 
TG I LKDAGVNNLT I QVE KEAY FQHMSGLS TGFHDVLAMTKQMES 
MKYCKDGTYIM 


6397 


391 


122 


GAGGVGRFE AIRAPARMI E WCNDRLG KKVRVKCNTDDTIGDLK 
KLIAAQTGTRWNKIVLKKWYTIFKDHVSLGDYEIHDGMNLELYY 
Q 


6398 


353 


1306 


HKQMGPLINRCKKILLPTTVPPATMRIWLLGGLLPPLLLLSGLQ 
RPTEGSEVAIKIDFDFAPGSFDDQYQGCSKQVMEKLTQGDYFTK 
DIEAQKNYFRMWQKAHLAWI^QGKVLPQNMTTTHAVAILFYTLN 
SNVHSDFTRAMASV7VRTPQQYERSFHFKYLHYYLTSAIQLLRKD 
SIMENGTLCYEVHYRTKDVHFNAYTGATIRFGQFLSTSLLKEEA 
QEFGNQTLFTIFTCLGAPVQYFSLKKEVLIPPYELFKVINMSYH 
PRGDWLQLRSTGNLSTYNCQLLKASSKKCIPDPIAIASLSFLTS 
VIIFSKSRV 


6399 


75 


1245 


PNLETYFGRRCEKDSMNFTPTHTPVCRKRTWSKRGVAVSGPTK 
RRGMADSLESTPLPSPEDRLAKLHPSKELLEYYQKXMAECEAEN 
EDLLKKLELYKEACEGQHKLECDLQQREEEIAELQKALSDMQVC 
LFQEREHVLRLYSENDRLRIRELEDKKKIQNLLALVGTDAGEVT 
YFCKEPPHKVTILQKTIQAVGECEQSESSAFKADPKISKRRPSR 
ERKESSEHYQRDIQTLILQVEALQAQLGEQTKLSREQIEGLIED 
RRIHLEEIQVQHQRNQNKIKELTKNLHHTQELLYESTKDFLQLR 
SENQNKEKSWMLE KDNLMSKI KQ YRVQCKKKEDK I GKVLPVMHE 
SHHAQSEYIKVMSLCRNEWYFSGRVEGIPKNLQFVM 


6400 


2520 


1053 


KTMKCDE WYE VQS A I LRHNCG YAM KTG KFFHNLMERKD FETWL 
DNISVTFLSLTDLQKNETLDHLISLSGAVQLRHLSNNLETLLKR 
DFLKLLPLELS F Y LLKWLDPQTLLTC C LVS KQ WNKV I S ACT E VW 
QTACKNLGWQIDDSVQDALHWKKVYLKAILRMKQLEDHEAFETS 
S LIGHS AR VYAL Y Y KDGLL CTG S DDL S AKLWD VS TGQ C VYG I QT 
HTCAAVKF DEQKL VTG S FDNTVACW E W S S GARTQHFRGHTG AV F 
SVDYNDELDILVSGSADFTVKVWALSAGTCLNTLTGHTEWVTKV 
VLQKCKVKSLLHSPGDYILLSADKYEIKIWPIGREINCKCLKTL 
SVSEDRSICLQPRLHFDGKYIVCSSALGLYQWDFASYDILRVIK 
TPEIANLALLGFGDIFALLFDNRYLYIMDLRTESLISRWPLPEY 
RKSKRGSS FLAGEAS WLNGLDGHNDTGLVFATSM PDHS IHLVLW 
KEHG 


6401 


109 


766 


PGAAWSRPDLRGCCTGPQPALRMLVLPSPCPQPLAFSSVETMEG 
PPRRTCRSPEPGPSSSIGSPQASSPPRPNHYLLIDTQGVPYTVL 
VDEESQREPGASGAPGQKKCYSCPVCSRVFEYMSYLQRHSITHS 
EVKPFECDICGKAFKRASHLARHHS IHLAGGGRPHGCPLCPRRF 
RDAGELAQHSRVHSGERPFQCPHCPRRFMEQNTLQKHTRWKHP 


6402 


1196 


279 


TTSQCGGIRQSSAIPVASMEFAAICLRNALLLLPEEQQDPKQEN 
GAKNSNQLGGNTESSESSETCSSKSHDGDKFIPAPPSSPLRKQE 
LENLKCS I LACSAYVALALGDNLMALNHADKLLQQPKLSGSLKF 
LGHLYAAEALISLDRISDAITHLNPENVTDVSLGISSNEQDQGS 
DKGENEAM ESS G KRAPQ C Y PS S VNS ARTVML FNLGS A YCLRS E Y 
DKARKCLHQAASMIHPKEVPPEAILLAVYLELQNGNTQLALQI I 
KRNQLLPAVKTHSEVRKKPVFQPVHPIQPIQMPAFTTVQRK 


6403 


2 


1690 


RGIHTSVLQGNLQNQMYSHNWIMNLNNLNLTQVQQRNLITNLQ 

AAI^SAIAKANNDTLEDMNSQLNSFTGQMENITTISQANEQNLK 
DLQDLHKDAENRTA I KFNQLEERFQLFETD I VNI I SNI S YTAHH 
L RTLTS NLNE VRTTCTDTLTKHTDDLTS LNNTLANI RLD S VS LR 
MQQDLMRSRLDTEVANLSVIMEEMKLVDSKHGQLIKNFTILQGP 
PGPRGPRGDRGSQGPPGPTGNKGQKGEKGEPGPPGPAGERGPIG 
PAGPPGERGGKGSKGSQGPKGSRGSPGKPGPQGPSGDPGPPGPP 
GKEGLPGPQGPPGFQGLQGTVGEPGVPGPRGLPGLPGVPGMPGP 
KGPPGPPGPSGAWPLALQNTEPTPAPEDNSCPPHWKNFTDKCYY 
FSVEKEIFEDAKLFCEDKSSHLVFINTREEQQWIKKQMVGRESH 
WIG LT DS ERENE W KWLDGTS PD Y KNW KAGQ PDN WGHGHG PGE D C 
AGL I YAGQWNDFQCEDVNNFI CEKDRET VLSSAL 


6404 


1012 


222 


AAALAMAAPAPGLISVFSSSQELGAAIiAQLVAQRAACCLiAGARA 
RFALGLSGGSLVSMliARELPAAVAPAGPASLARWTLGFCDERLV 
PFDHAESTYGLYRTHLLSRLPIPESQVITINPELPVEEAAEDYA 
KKLRQAFQGDSIPVFDLLILGVGPDGHTCSLFPDHPLLQEREKI 
VAPISDSPKPPPQRVTLTLPVLNAARTVIFVATGBGKAAVLKRI 
LEDQEENPLPAALVQPHTGKLCWFLDEAAARLLTVPFEKHSPL 


6405 




1456 


JMMjf Kf i rKH^Jj^KbeTGSDSEMAASMFYGRLVAVATLRNHRPR 
TAQRAAAQVLGSSGLFNNHGLQVQQQQQRNLSLHEYMSMELLQE 
AGVS VPKGYVAKS PDEAYAIAKKLGSKDWI KAQ V LAGGRG KGT 
FESGLKGGVKIVFSPEEAKAVSSQMIGKKLFTKQTGEKGRICNQ 
VLVCERKYPRREYYFAITMERSFQGPVLIGSSHGGVNIEDVAAE 
T P EAI IKEPIDIEEGI KKEQALQLAQKMG F PPNI VE S AAENMVK 
LYSLFLKYDATMIEINPMVEDSDGAVLCMDAKINFDSNSAYRQK 
r ULAjLiYi i yc.UliKL>ivUAf\KAN JjN Y IGIjDGNIGCLVNGAGLiAMA 
TMDIIKLHGGTPANFLDVGGGATVHQVTEAFKLITSDKKVLAIL 
VNIFGGIMRCDVIAQGIVMAVKDLEIKIPVWRLQGTRVDDAKA 
iJA/wooiJi\.±ii/iLXJUljUl!.A/\KM v V Jvliofi J. V 1 JjAK.yAH VDVKF QLP 
I 


6406 


1036 


167 


HPRQMRGEDTPEAPPYSSGRYDS IKTEVSGCPEDLTVGRAPTAD 
DDDDDHDDHEDNDKMNDSEGMDPERLKAFNMFVRLFVDENLDRM 
VPISKQPKEKIQAIIESCSRQFPEFQERARKRIRTYLKSCRRMK 
lUNorJJCirJ iitr 1 r JrflJU JL oAPUUiJNl lliAAAL-JabJl I RKAAKRMRLEI YQ 
SSQDEPIALDKQHSRDSAAITHSTYSLPASSYSQDPVYANGGLN 
YS YRG YGAL S SNLQ P PAS LQTGNHSNGE S GE ARALAS R PAPS WV 
\~ ivvum o \jn vj k\3 ivy t\ if v n n» Ko v_ Jj l A 


6407 


492 


150 


VGLCLAVSQTVLAQLDALLVFPGQVAQLS CTLS PQHVT I RDYG V 
SWYQQRAGSAPRYLLYYRSEEDHHRPADIPDRFSAAKDEAHNAC 
VLTISPVQPEDDADYYCSVGYGFSP 


6408 


1458 


903 


RUV ' 1 ii3c>vAvyK.ijroov IXXjB WrlKl JLi\UirLb«rr'X xPGnGMMFVR 
NDCKVFRFCKSKCHKNFKKKRNPRKVRWTKAFRKAAGKELTVDN 
S FEFEKRRNEP I KYQRELWNKTI DAMKRVEE I KQKRQAKFIMNR 
LKKNKELQKVQDIKEVKQNIHLIRAPLAGKGKQLEEKMVQQLQE 
DVDMEDAP 


6409 


150 


446 


NTALANLLRCFTCDRLCGGCTAPAPPAHQGIVLQPVMPSCDPGP 
GPACLPTKTFRSYLPRCHRTYSCVHCRAHLAKHDELISKSFQGS 
HGRAYLFNSV 


6410 


85 


607 


RGGTAGCVACLGCWGQSSSPKAAFPAGSACLPADSCPCLLFQAC 
AISGLFNCITIHPLNIAAGVWMIMNAFILLLCEAPFCCQFIEFA 
NTVAE KVDRLRSWQKAVF YCGMAVVP I VI SLTLTTLLGNAIAFA 
TGVLYGLSALGKKGDAISYARIQQQRQQADEEKLAETLEGEL 


6411 


302 


772 


RLSIMASSLNEDPEGSRITYVKGDLFACPKTDSLAHCISEDCRM 
GAGIAVLFKKKFGGVQELLNQQKKSGEVAVLKRDGRYIYYLITK 
KRASHKPTYENLQKSLEAMKSHCLKNGVTDLSMPRIGCGLDRLQ 
WENVS AM I E E V FEATD I K I T VYTL 


6412 


61 


1709 


RPVTSFSPLPGSCGGRLGTRTMLGRSLREVSAALKQGQITPTEL 
COKCLS LI KKTKFLNAY I TVS EEVALKOAEE RKP Yin^noci T ,«n 

IiDGIPIAVKDNFSTSGIETTCASNMLKGYIPPYNATWQKLLDQ 
GALLMGKTNLDEFAMGSGSTDGVFGPVKNPWSYSKQYREKRKQN 
PHSENEDSDWLITGGSSGGSAAAVSAFTCYAALGSDTGGSTRNP 
AAHCGLVGFKPSYGLVSRHGLiIPLVNSMDVPGIIiTRCVDDAAIV 
LiGALAGPDPRDSTTVHEPINKPFMLPSLADVSKLCIGIPKEYLV 
PELSSEVQSLWSKAADLFESEGAKVIEVSLPHTSYSIVCYHVLC 
TSEVASNMARFDGLQ YGHRCD I DVSTEAMYAATRREGFNDWRG 
R I LS GNFFLL KENYENY F VKAQ KVRRL IANDFVNAFNSG VD VLL 
TPTTLSEAVP YLEFI KEDNRTRSAQDD I FTQAVNMAGLPAVS I P 
VALSNOGLPIGLQFIGRAFCDQQLLTVAKWFEKQVQFPVIQLQE 
LMDDCSAVLENEKLASVSLKQ 


6413 


2 


885 


HEPRCAGMAASLWMGDLEPYMDENFISRAFATMGETVM<5VKTTP 
N RLTG I P AG YC F VE FAD LATAE KCLH K I NGKP L PGAT P AKR F KL 
NYATYGKQPDNSPEYSLFVGDLTPDVDDGMLYEFFVKVYPSCRG 
G KWLD QTG VS KG YG F VKFTDEL EQKRALTE CQGAVGLGS KP VR 
LSVAIPKASRVKPVEYSQMYSYSYNQYYQQYQNYYAQWGYDQNT 
GSYSYSYPQYGYTQSTMQTYEEVGDDALEDPMPQLDVTEANKEF 
MEQSEELYDALMDCHWQPLDTVSSEIPAMM 


6414 


1 


538 


RGGRAALLPWRRFPCCRPRPQPARPSSRATPGPRSPGMATSltiV 
SFSVGDGVPEAEKNAGEPENTYILRPVFQQRFRPSWKDCIHAV 
L KE E LANAE YS PE EM PO LTKH T . <? FN T KT) T . VT? Mr* pnu v VM\nrri\r 

VIGEQRGEGVFMASRCFWDADTDNYTHDVFMNDSLFCWAAFGC 
FYY 


6415 


2 


1168 


FVRQWQSSHRRACGLGCEARAGGGEEPRGRASSVAGWVGAFRAP 
F X E AAVAGLGAGSGKRRRGWKMPVHSRGDKKETNHHDEME VDYA 
ENEGSSSEDEDTESSSVSEDGDSSEMDDEDCERRRMECLDEMSN 
LEKQFTDLKDQLYKERLSQVDAKLQEVIAGKAPEYLEPLATLQE 
NMQ I RTKVAG I YRELCLES VKNKYECE I QAS RQHCESEKLLL YD 

TVQSELEEKIRRLEEDRHSIDITSELWNDELQSRKKRKDPFWPD 
KKKPGWSGPYIVYMLOnT.rJTTiFnWTTTPViMa'PT/^DUDi/VTco 

PVKLEKHLHSARSEEGRLYYDGEWYIRGQTICIDKKDECPTSAV 
ITTINHDEVWFKRPDGSKSKLYISQLQKGKYSIECHS 


6416 


410 


1519 


EIAPADLEIPACAPVLLSRATSSTMSVTGGKMAPSLTQEILSHL 
GLASKTAAWGTLGTLRTFLNFSVDKDAQRLLRAITGQGVDRSAI 
VDVLTNRSREQRQLISRNFQERTQQDLMKSLQAALSGNLERIVM 
ALLQPTAQFDAQELRTALKASDSAVDVAIEILATRTPPQLQECL 
*\v i Ann r v v c*/\v laj j. ioLi jjyuijijLu/VLAK.o(jRlJS YSG X XDY 
NLAEQDVQALQRAEGPSREETWVPVFTQRNPEHLIRVFDQYQRS 
TGQELEEAVQNRFHGDAQVALLGLASVIKNTPLYF/VDKLHQALQ 
ETEPNYQVLIRILISRCETDLLSIRAEFRKKFGKSLYSSLQDAV 
KGDCQSALLALCRAEDM 


6417 


1 


845 


RGESRVLWSELEGEAGGAGGWAS ( ?LMAPMnMT?T5 , ATAT7UT amn c 
L I S T I YMAAS IGTDFW YE YRS P VQENSSDLNKS I WDEFI SDEAD 
EKTYNDALFRYNGTVGLWRRCITIPKNMHWYSPPERTESFDWT 
KCVSFTLTEQFMEKFVDPGNHNSGIDLLRTYLWRCQFLLPFVSL 
GLMCFGAL IGLCACICRSLYPTIATG I LHLLAGLCTLGS VSCYV 
AG I ELLHQKLELPDNVSGE FGWS FCLACVSAPLQFMASALF I WA 
AHTNRKEYTLMKAYRVA 


6418 


2 


662 


TRTRPRRPPGLGAAVGKAGARSTSTPAGASPAAAYQADPPPPAH 
TPAPPPPPPCGGTACHGT'lPAK'VYrJYnMT.OPnDTP'r'pnnwaT?! \rr\ 

YPDCKSSSGNIGEDPDHLNQSSSPSQMFPWMRPQAAPGRRRGRQ 
TYSRFQTLELEKEFLFNPYLTRKRRIEVSHALAliTERQVKIWFQ 
NRRMKWKKENNKDKFPVSRQEVKDGETKKEAQELEEDRAEGLTN 


6419 


1 


973 


PGRPRVRNFDLNSKSILQEFFCTRS IQI PANRSKTAMSKCP t FP 
MARS I STSG PTiDKEDTGRQKLISTGSLPATLQGATDSLGLEWHL 
PSPDPVTVPYLSPLWWKELESLLENEGDHAITVADFVDHHPIV 
FWNL VW YFRRL DL P SNL PGL ILSSEHCN KYS KIP RHCMS EDS K Y 
VL I QMLWDNMKLHQDPGQPL Y I LWNAHTQKYPMVHLLQKSDNS F 
NQELLKSMVKSIKI^NDVYGPMSQILETLNKCPHFKRQRSLYREI 
LFLS L VALGREN I D I DAFDKE Y KMA YDRLTPSQ VKSTHNCDR P P 
STGVMECRKTFGEPYL 


6420 


207 


1187 


RKMIDKNQTCGVGQDSVPYMICLIHILEEWFGVEQLEDYLNFAN 
YLLWVFTPLILLILPYFTIFLLYLTIIFLHIYKRKNVLKEAYSH 
NLWDGARKTVATLWDGHAAVWHGYEVHGMEKIPEDGPALIIFYH 
GAI P I DF YYFMAKI FI HKGRTCR WADHFVFKI PGFSLLLDVFC 
ALHGPREKCVEILRSGHLLAISPGGVREALISDETYNIVWGHRR 
GFAQVAIDAKVPI I PMFTQNIREGFRSLGGTRLFRWLYEKFRYP 
FAPMYGGFPVKLRTYLGDP I PYDPQITAEELAEKTKNAVQALID 
KHQRI PGNI MSALLER FH 


6421 


1844 


362 


WALSLRkOPERMSNKLLS PH PH QWI.R s p p v-mzv QQDavrpaQor — 
YQWSLKS SAQ FLGS PQLRQVGQ I IRVPARMAATL I LE PAGRCCW 
DEPVRIAVRGLAPEQPVTLRASLRDEKGAIjFQAHARYRADTLGE 
LDLERAPALGGS FAGLE PMGLLWALE PE KPL VRLVKRD VRT P LA 
VELEVLDGHDPDPGRLLCQTRHERYFLPPGVRREPVRVGRVRGT 
LFLPPEPGPFPGIVDMFGTGGGLLEYRASLIAGKGFAVMALAYY 
NYEDLPKTMETLHLEYFEEAMNYLLSHPEVKGPGVGLLGISKGG 
ELCLSMASFLKGITAAWINGSVANVGGTLRYKGETLPPVGVNR 
NRI KVTKDG YAD I VD VLNSPLEGPDQKS FI PVERAESTFLFLVG 
QDPHNWKSEFYANBACKRLQAHGRRKPQIICYPETGHYIEPPYF 
PL CRAS LHAL VGS P 1 1 WGGE P RAHAMAQ VDAW KQ LQT F FHKHLG 
GREGTIPSKV 


6422 


181 


2133 


EGENLSWFQEFWGDIAKEFYWKTPCPGPFLRYNFDVTKGKIFIE 
WM KG ATTN I C YNVLDRNVHE KKLGD KVAFYWEGNE PG ETTQ I T Y 
HQLLVQ VCQ FS NVLRKQG I HKGDR VA I YM PM I P EL WAM LACAR 
IGALHSIVFAGFSSESLCRRTT.nQQPcjT.T.TTTnnPvon-cvT ttvtt 

KELADEALQKCQEKGFPVRCCIWKHLGRAELGMGDSTSQSPPI 
KRSCPDVQISWNQGIDLWWHELMQEAGDECEPEWCDAEDPLFIL 
YTSGSTGKPKGWHTVGGYMLYVATTFKYVFDFHAEDVFWCTAD 
I GW I TGHS Y VT YGP LANGATS VLFEG I P TY P DVNRLWS I VDK Y K 
v j ivr x tH.xr i/ii.Kljij[ v iJVrtjUc.FV 1 J^oKASLQVLGTVGEPINPEA 
WLWYHRWGAQRCPIVDTFWQTETGGHKLTPLPGATPMKPGSAT 
FPFFGVAPAILNESGEELEGEAEGYLVFKQPWPGIMRTVYGNHE 
RFETTYFKKFPGYYVTGDGCQRDQDGYYWITGRIDDMLNVSGHL 
LSTAEVESALVEHEAVAEAAWGHPHPVKGECLYCFVTLCDGHT 
FSPKLTEELKKQIREKIGPIATPDYIQNAPGLPKTRSGKIMRRV 
LR K I AQN DHDLG DMS T VADPS V I S HLFS HRCLT I Q 


6423 


614 


1237 


ANLXE I PRDLPPETVLL YLDSNQ I TS IPNE I FKDLHQLR VLNLS 
KNG I E F I D EHAF KG VAE TLQTLDLS DNR IQS VHKNAFNNLKARA 

jvAiiiinNrwnwwv^ i UUU viiRo Plrto IN fill 1 >UiN v J. t» & A oVLD&HAGRP 

FLNAANDADLCNLPKKTTDYAMLVTMPGWFTMVISYWYYVRQN 
Q E D ARRHL E Y LKS LPS RQ KKAD E PDD I S TW 


6424 


1 


1188 


KKVSWPVAAMVHCSCVLFRKYGNFIDKLRLFTRGGSGGMGYPRL 
GGEGGKGGDVWWAHNRMTLKQLKDRYPRKRFVAGVGANSKISA 
LKGSKGKDWEI PVPVGISVTDENGKI IGELNKENDRILVAQGGL 
GGKLLTNFLPLKGQKRIIHLDLKLIADVGLVGFPNAGKSSLLSC 
VSHAKPAIAD YAFTTLKPELGKIM YS DFKQI S VADL PGL I EGAH 
MNKGMGHKFLKHIERTRQLLFWDISGFQLSSHTQYRTAFETII 
LLTKELELYKEELQTKPALLAVNKMDLPDAQDKFHELMSQLQNP 
KDFLHLFEKNMIPERTVEFQHIIPISAVTGEGIEELKNCIRKSL 
DEOANOENDATjHKKOTiTjNTiWT ^DTM^QTPDDCinja^rrTCYMnTT 


6425 


1850 


1144 


LAMEGGGGIPLETLKEESQSRHVLPASFEVNSLQKSNWGFLLTG 
LVGGTLVAVYAVATPFVTPALRKVCLPFVPATMKQIENWKMLR 
CRRGSLVDIGSGDGRIVIAAAKKGFTAVGYELNPWLVWYSRYRA 
WREGVHGS AKFY I SDLWKVTFSQYSNWI FGVPQMMLQLEKKLE 
RELEDDARVIACRFPFPHWTPDHVTGEGIDTVWAYDASTFRGRE 
KRPCTSMHFQLPIQA 


6426 


30 


565 


SRGAAVGGMS VAGGE I RGDTGGEDTAAPGRFS FS PE PTLED I RR 
I.HAE FAAERDWEQ FHQPRNLLLALVGE VGELAELFQWKTDGEPG 
PQGWS PRERAALQEELSDVL I YLVALAARCR VDLPLAVLSKMDI 
NRRRYPAHLARSSSRKYTELPHGAISEDQAVGPADIPCDSTGQT 
ST 


6427 


145 


959 


AASWGPPHVPKAGKMVSWMICRLWLVFGMLCPAYASYKAVKTK 
NIREYVRWM^WIVFALFMAAEIVTDIFISWFPFYYEIKMAFVL 
WLLSPYTKGASLLYRKFVHPSLSRHEKEIDAYIVQAKERSYETV 
LS FGKRGLN I AASAAVQAATKSQGALAGRLRS FSMQDLRS I SDA 
PAPAYHDPLYLEDQVSHRRPPIGYRAGGLQDSDTEDECWSDTEA 
VP RAP AR PREKPLIRSQS LR WKR KP P VREGTS R S LKVRTRK KT 
VPSDVDS 


6428 


1982 


444 


SGSGGKMEDHOHVPIDIOTSKT J 'LDMT,vnPRwr'<;T.K'MnQT. i irr ttd" 

EKINAAIQDMPESEEIAQLLSGSYIHYFHCLRILDLLKGTEAST 

KNIFGRYSSQRMKDWQEIIAXjYEKDNTYLVELSSLLVRNVNYEI 

PSLKKQIAKCQQLQQEYSRKEEECQAGAAEMREQFYHSCKQYGI 

TGE^^WRGELLALVKDLPSQLAEIGAAAQQSLGEAIDVYQASVGF 

VCESPTEQVLPMLRFVQKRGNSTVYEWRTGTEPSWERPHLEEL 

P EQ VAEDA I D WG D FGVE AVS EGTD S G I S AEAAG IDWGIFPESDS 

KDPGGDGIDWGDDAVALQ I TVLEAGTQAPEGVARGPDALTLLEY 

TETRNQFLDELMELEIFLAQRAVELSEEADVLSVSQFQLAPAIL 

QGQTKEKMVTMVSVLEDLIGKLTSLQLQHLFMlLASPRYVDRVT 

EFLQQKLKQSQIiLALKKELMVQKQQEAr.,EEQAALEPKLDLLLEK 

TKELQKLIEADISKRYSGRPVNLMGTSL 


6429 


3413 


3442 


EPSSWTAAPRGPLAAHPLEAAVQEDDRRALSFDSRIKVFANGTL 
WKSVTDKDAGDYLCVARNKVGDDYWLKVD WMKPAKI EHKEE 
^HKVFYGGDLKVDCVATGLPNPEISWSLPDGSLVNSFMQSDDS 
GGRTKRYWFNNGTLYFNEVGMREEGDYTCFAENQVGKDEMRVR 
V IW V 1 AFA 1 1 KN KTCLjAVQVP YGDVVTVACEAKGEPMPKVTWLS 
PTNKVIPTSSEKYQIYQDGTLLIQKAQRSDSGNYTCLVRNSAGE 
DRKTVWIHVNVQPPKINGNPNPITTVREIAAGGSRKLIDCKAEG 
I PTP R VL WA F P EG WLPAP Y YGNR I T VHGNG S LD I R S LR KSDSV 
QLVCMARNEGGEARLI VQLTVLEPMEKP I FHDPI S EKITAMAGH 
TISLNCSAAGTPTPSLVWVLPNGTDLQSGQQLQRFYHKADGMLH 
I SGLS S VDAGAYRCVARNAAGHTERLVS LKVGLKP EANKQYHNL 
VSIINGETLKLPCTPPGAGQGRFSWTLPNGMHLEGPQTLGRVSL 
LDNGTLTVREAS VFDRGT YVCRMETE YGPSVTS I PVI VI AYPPR 
ITSEPTPVIYTRPGNTVKLNCMAMGIPKADITWELPDKSHLKAG 
VQARL YGNRFLH P QG S LT I QHATQRDAG F YKCMAKN I LGS DS KT 
TYIHVF 


6430 


1946 


602 


RTRVSTGLRRTLLWSEAVGASSTRGDTGIPGSGEGGAGPGGGEG 
AMLEAMAEPSPEDPPPTLKPETQPPEKRRRTIEDFNKFCSFVLA 
YAGYIPPSKEESDWPASGSSSPLRGESAADSDGWDSAPSDLRTI 

QTFVKKAKSSKRRAAQAGPTQPGPPRSTFSRLQAPDSATLLEKM 
KLKDSLFDT J DGPKVA t ?PT. c ?PT < ?T.TT4TQPPDZiIiT tdwdt cnnm o 

HPPRKKDRKNRKLGPGAGAGFGVLRRPRPTPGDGEKRSRIKKSK 
KRKLKKAERGDRLPPPGPPQAPPSDTDSEEEEEEEEEEEEEEMA 
TWGGEAPVPVLPTPPEAPRPPATVHPEGVPPADSESKEVGSTE 
TSQDGDASSSEGEMRVMDEDIMVESGDDSWDLITCYCRKPFAGR 
PMIECSLCGTWIHLSCAKIKKTNVPDFFYCQKCKELRPEARRLG 
GPPKSGEP 


6431 


3 


605 


WWNSSYNLPAYAPYLPCEACAMQDGRKGGAYAGKMEATTAGVGR 
LEEEALRRKERLKALREKTGRKDKEDGEPKTKHLREEEEEGEKH 
RELRLRNYVPEDEDLKKRRVPQAKPVAVEEKVKEQLEAAKPEPV 
IEEVDLANLAPRKPDWDLKRDVAKKLEKLKKRTQRAIAELIRER 
LKGQEDS LASAVDAATEQKTCDSD 


6432 


56 


1692 


GGLGTMGSR I KQNPETTFEVYVE VAYPRTGGTLSDPEVQRQFPE 
D YS DOE VLQ TLTKFCFP F YVD SLT VS Q VGQNFTFVLTD I DS KQ R 
FGFCRLSSGAKSCFCILSYLPWFEVFYKLLNILADYTTKRQENQ 
WNE LLETLHKLP I PD PGVSVHLS VHSYFTVPDTREL PS I PENRN 
LTE Y FVAVDVNNMLHL YASML YERR I LI I CS KLSTLTACI HGSA 
AMLYPMYWQHVYIPVLPPHLLDYCCAPMPYLIGIHLSLMEKVRN 
MALDDW I LNVDTNTLETPFDDLQS LPNDVI S SLKNRLKKVS TT 
TGDGVARAFLKAQAAFFGSYRNALKIEPEEPITFCEEAFVSHYR 
SGAMRQFLQNATQLQLFKQFIDGRLDLLNSGEGFSDVFEEEINM 
GE YAGS DKL YHQ WLS TVRKGSGA I LNTVKTKANPAMKTVYKFD I 
AENGCAPTPEEQLPKTAPSPLVEAKDPKLREDRRPITVHFGQVR 
PPRPHWKRPKSNIAVEGRRTSVPSPEQNTIATPATLHILQKSI 
THFAAKFPTRGWTSSSH 


6433 


1524 


484 


APVTKRKEVFAKDSKGSALDAGRDPKRPALPETLCESGWASNTA 
PTTPPQPGWCLCGKDFKSSCQTPGREKERRLATMHGSCSFLMLL 
L PLLLL L VATTG P VGALTDE E KRLMVE LHNL YRAQ VS P TAS DML 
HMRWDEELAAFAXAYARQCVWGHNKERGRRGENLFAITDEGMDV 
PLAMEEWHHEREHYNLSAATCSPGQMCGHYTQWWAKTERIGCX3 
S H FCE KLQG VE ETN I EL L VCNYE P PGNVXG KR P YQ EGT P CS Q CP 
SG YHCKNSLCE P I GS PEDAQDLP YLVTEAPS FRATEASDSRKMG 
AEGPDKPSWSGLNSGPGHVWGPLLGLLLLPPLVLAGIF 


6434 


40 


2002 


MPQLNFGMADPTQMGGLSMLLLAGEHALGTPEVFSGTCRPDVSE 

SPELRQKSPLFQFAEISSSTSHSDASTKQCQTSALFQFAEISSN 

TSQLGGAEPVKRCGKSALFQLAEMCLASEGMKMEESKLIKAKES 

DGGRIKELEKGKEEKEIKMEKTDETRLQKEAEFEKSAKBNLRDS 

ICELRNFEAIiQIDDIMAIKMEDPKEIRKEELEEDHKCSHFPDFSY 

SASSKI IISDVPSRKDHMCHPHGIMI IEDPAALNKPEKLKKKKK 

KSKMDRHGNDKSTPKKTCKKRQSSESDIESVIYTIEAVAXGDWG 

I EKLGDT PRKKVRTS SSGKG S I LDAKP PKKKVKSRE KKMS KE KS 

SDTTKESRPPDFISISASKNISGETPEGIKAEPLTPMEDALPPS 

LSGQAKPEDSDCHRKIETCGSRKSERSCKGALYKTLVSEGMLTS 

LRANVDRGKRSSGKGNSSDHEGCWNEESWTFSQSGTSGSKKFKK 

TKPKEDCLLGSAKLDEEFEKKFNSLPQYS PVTFDRKCVPVPRKK 

KKTGNVSSEPTKTSKGSGDKWSNKQLFLDAIHPTEAI FSEDRNT 

MEPVHKVKNIPSIFNTPEPTTTARTFGGQPKEKSKENPDYSPCQ 

DTQRAGYHHEEVLWMTNLMNNCGGVYLKQLRHTAMTNA 


6435 


2227 


657 


ALQRDAAAAYAHPE YEERFLQEETVSQQINS IELLQTRPLALPE 

WKS QR P LQRQ VHLRGR PAS Q PTVT RG I T Y Y KAKVS E E END I E E 

QQDEFFSGDNGVDLLXEDQLLRHNGLMTSVTRRPAATRQGHSTA 

VTSDLNARTAPWSSALPQPSTSDPSIANHASVGPTLQTTSVSPD 

PTRESVLQPS PQVP ATTVAHTATQQ PAAP A P PAVS PREALMEAM 

HTVPVPPTTVRTDSLGKDAPAGRGTTPASPTLSPEEEDDIRNVI 

GRCKDTLSTITGPTTQNTYGRNEGAWMKDPLAKDERIYVTNYYY 

GNTLVEFRNLENFKQGRWSNSYKLPYSWIGTGHWYNGAFYYNR 

AFTRNI I KYDLKQRYVAAWAMLHDVAYEEATP WRWQGHSDVDFA 

VDENGLWLIYPALDDEGFSQEVIVLSKLNAADJjSTXJKETTWRTG 

LRRNFYGNCFVICGVIiYAVDSYNQRNANISYAFDTHTNTQIVPR 

LLFENEYFYTTQIDYNPKDRLLYAWDNGHQVTYHVIFAY 


6436 


1295 


341 


GACR P P VRQD PDSG PDY EALPAGAT VTTHMVAGAVAG I LEHC VM 
YPIDCVKTRMQSLQPDPAARYRITVLEALWRIIRTEGLWRPMRGL 
NVTATGAG PAHAL YFAC YE KL KKTLS DV I H PGGNSH I ANGAAGC 
w n * iiM r Jin v v itiriyro zst o Jr I HKV 1 JJ L. V KAVWQNEGAG 
AFYRSYTTQLTMNVPFQAIHFMTYEFLQEHFNPQRRYNPSSHVL 
SG ACAGAVAAAATTP LDVCKTLLNTQESLALNSHITGHI TGMAS 
AFRTVYQVGGVTAYFRGVQARVI YQI PSTAIAWSVYEFFKYLIT 
KRQEEWRAGK 


6437 


1828 


3*0 - 


P PAPAP PAS PARHVTRTARGHLEGGSRAP PLLQAVFLQ I KNMVK 
LIHTLADHGDDVNCCAFSFSLLATCSLDKTIRLYSLRDFTELPH 
SPLKFHTYAVHCCCFSPSGHILASCSTDGTTVLWNTENGQMLAV 
MEQPSGSPVRVCQFSPDSTCLASGAADGTWLWNAQSYKLYRCG 
SVKDGSLAACAFSPNGSFFVTGSSCGDLTVWDDKMRCLHSEKAH 
DLGITCCDFSSQPVSDGEQGLQFFRLASCGQDCQVKIWIVSFTH 
ILGFELKYKSTLSGHCAPVLACAFSHDGQMLVSGSVDKSVIVYD 
TNTENILHTLTQHTRYVTTCAFAPNTLLLATGSMDKTVNIWQFD 
LETLCQARSTEHQLKQFTEDWSEEDVSTWLC&QDLKDLVGIFKM 
NNIDG KELLNLTKES LADDLKI ESLGLRS KVLRKI EELRTKVKS 
LSSGI PDE F I CP I TRELMKDPVI ASDGYS YEKE AMENWDPAKRN 
RTSPP 


6438 


109 


901 


EVQILRAKMFQTGGLIVFYGLLAQTMAQFGGLPVPLDQTLPLNV 
NPALPLS PTGLAGSIjTNALSNGTiTiSnnT.T.nT TiFMT.PT .T.nTT.TfDr 1 

GGTSGGLLGGLLGKVTSVIPGLNNIIDIKVTDPQLLELGLVQSP 
DGHRL YVT 1 PLG I KLQVNTPLVGAS LLRLAVKLD ITAE I LAVRD 
KQERIHLVLGDCTHSPGSLQISLLDGLGPLPIQGLLDSLTGILN 
KVL PE LVQGNVC P LVNE VLRGLD I TL VHD I VNML I HGLQ F V I KV 


6439 


23 


412 


SIQTASAITTEMASQSQGIQQLLQAEKRAAEKVADARKRKARRL 
KQAKE EAQMEVEQ YRREREHE FQS KQQAAKGS QGNLSAE VEQAT 
RRQVQGMQSSQQRNRERVLAQLLGMVCDVRPQVHPNYRISA 


6440 


3 


517 


RARWNSDMGDLPGLVRLS IALRIQPNDGPVFYKVDGQRFGQNRT 
TKIjTjTG^QYKVFV'KTK'PQTT/YV/FISJTQ Xm\n UDT t?t /cvunnrn 

RWYTGTYDTEGVTPTKSGERQPIQITMPFTDIGTFETVWQVKF 
YNYHKRDHCQWGSPFSVIEYECKPNETRSLMWVNKESFL 


6441 


234 


1373 


KSGGLRRRQRPGRSAAVGEEELPPGMEKFKAAMLLGSVGDALGY 
RNVCKENSTVGMKIQEELQRSGGLDHLVLSPGEWPVSDNTIMHI 
ATAEALTTDYWCLDDLYREMVRCYVEIVEKLPERRPDPATIEGC 
rtUiJ^rriiiviijij/iWil i fc'rJNolUaijljfcljAAi aAMuJLOJjRx WKPERLET 
LIEVSVECGRMTHNHPTGFLGSLCTALFVSFAAQGKPLVQWGRD 
MLRAVPLAEEYCRKTIRHTAEYQEHWFYFEAKWQFYLEERKISK 
DS ENKA I F PDN YDAE ERE KT YR KWS S EGRGGRRGHDAPM I AYDA 
LLAAGNSWTELCHRAMFHGGESAATGTIAGCLFGLliYGLDLVPK 

GLYODLEDK - FKTiFnT/3AAT»YT?T.QTFFTT 


6442 


34 


796 


AEDPAGGLAGQDTMFARGLKRKCVGHEEDVEGALAGLKTVSSYS 
LQRQSLLDMSLVKLQLCHMLVEPNLCRSVLIANTVRQIQEEMTQ 
DGTWRTVAPQAAERAPLDRLVSTEILCRAAWGQEGAHPASGLGD 
GHTOGP VSDLCP VTSAOAPFJfLOQ ciflWRHnRPDPNPr c puvct n 

QIFETLETKNPSCME3LFSDVDSPYYDLDTVLTGMMGGARPGPC 
EGLEGLAPATPGPSSSCKSDLGELDHWEILVET 


6443 


2 


555 


MASPAASSVRPPRPKKEPQTLVIPKNAAEEQKLKLERLMKNPDK 
AVPIPEKMSEWAPRPPPEFVRDVMGSSAGAGSGEFflVYRHIjRRR 
EYQRQDYMDAMAEKQKLDAEFQKRLEKNKIAAEEQTAKRRKKRQ 

VPSFTMGR 


6444 


390 


899 


G S T PRG KMRAP I P E P KPGDL I E I FRP FYRHWAI YVGDG YWHLA 
P P S E VAGAGAAS VM S ALTDKA I VKKE LL YDVAGSD K YQ VNNKH D 
DKYSPLPCSKIIQRAEELVGQEVLYKLTSENCEHFVNELRYGVA 
RSDQVRDVIIAASVAGMGLAAMSLIGVMFSRNKRQKQ 


6445 


2 


753 


AGAAGAAGAARSPRPQAHTKGVRGLPSRRRSPDCGRMELAAGSF 
SEEQFWEACAELQQPALAGADWQLLVETSGISIYRLLDKKTGLY 
EYKVFGVLEDCSPTLLADIYMDSDYRKQWDQYVKELYEQECNGE 
TWYWEVKY P FPMSNRD YVYLRQRRDLDMEGRKIHV 1 IARSTSM 
PQLGERSGVIRVKQYKQSLAIESDGKKGSKVTMYYFDNPGGQIP 
SWLINWAAKNGVPNFLKDMARACQNYLKKT 


6446 


1 


1651 


RCPTRSPPPDTPGSRGTTAMCSLASGATGGRGAVENEEDLPELS 
DSGDEAAWEDEDDADLPHGKQQTPCLFCNRLFTSAEETFSHCKS 
EHQFNIDSMVHKHGLEFYGYIKIilNFIRLKNPTVEYMNSIYNPV 
PWEKEEYLKPVLEDDLLLQFDVEDLYEPVSVPFSYPNGLSENTS 
WEKLKHMEARALSAEAALARAREDLQKMKQFAQDFVMHTDVRT 
CSSSTSVIADLQEDEDGVYFSSYGHYGIHEEMLKDKIRTESYRD
FIYQNPHIFKDKWLDVGCGTGILSMFAAKAGAKKVLGVDQSEI
Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








LYQAMDIIRLNKLEDTITLIKGKIEEVHLPVEKVDVIISEWMGY
FLLFESMLDSVLYAKNKYLAKGGSVYPDICTISLVAVSDVNKHA
DRIAFWDDVYGFKMSCMKKAVIPEAWEVLDPKTLISEPCGIKH
IDCHTTSISDLEFSSDFTLKITRTSMCTAIAGYFDIYFEKNCHN
RWFSTGPQSTKTHWKQTVFLLEKPFSVKAGEALKGKVTVHKNK
KDPRSLTVTLTLNNSTQTYGLQ


6447 


1554 


1068 


R LG P AE WH1) S GP CHATLGAANRG RALG VRAAWRGAP L CQR VMM P 
SRTNLATGIPSSKVKYSRLSSTDDGYIDLQFKKTPPKIPYKAIA 
LATVLFLIGAFLIIIGSLLLSGYISKGGADRAVPVLIIGILVFL
PGFYHLRIAYYASKGYRGYSYDDIPDFDD 


6448 


74 


559 


GQVLSHCYHYRSSRWRRGGLSRGRGAGVMALVPYEETTEFGLQK 
FHKPLATFSFANHTIQIRQDWRHLGVAAWWDAAIVLSTYLEMG 
AVELRGRSAVELGAnTRTiVfiTVaaT.T.apPTP VTrDTTNTMCT tvmt t?» 

QFIVRKVHYDPEKDVHIYEAQKRNQKEDL


6449 


597 


1876 


EYGVCENLRKLEITGVSCRDVYAKLLHRYRHILGLWQPDIGPYG
GLLNWVDGIiFIIGWMYLPPHDPHVDDPMRFKPLFRIHIjMERKA 
ATVECMYGHKGPHHGHIQIVKKDEFSTKCNQTDHHRMSGGRQEE
* "^"OnUR i ucju x c ncitvnsiCtijL LtMK.r ± i 1 oyxL/NCLTYRRI 
YLPPSRPDDLIKPGLFKGTYGSHGLEIVMLSFHGRRARGTKITG 
DPNIPAGQQTVEIDLRHRIQLPDLENQRNFNELSRIVLEVRERV 
RQEQQEGGHEAGEGRGRQGPRESQPSPAQPRAEAPSKGPDGTPG 
EDGGEPGDAVAAAEQPAQCGQGQPFVLPVGVSSRNEDYPRTCRM 
CFYGTGLIAGHGFTSPERTPGVFILFDEDRFGFVWLELKSFSLY
SRVQATFRNADAPSPOAFDEMLICNTneiT TQ 


6450 


848 


269 


FVPAPRTVSGKRSLPGEWEERGEGEQRTGREFSGNGGRAVEAAR

MRLLCGLWLWLSLLKVLQAQTPTPLPLPPPMQSFQGNQFQGEWF 

VLGLAGNSFRPEHRALLNAFTATFELSDDGRFEVWNAMTRGQHC 

DTWSYVLIPAAQPGQFTVDHRVWTHEQAGRPQDQPAGQELVAAS 

RDAGPVHLPGQSSGPLG 


6451 


232 


939 


HSPTPPTSPRASTMEDVKLEFPSLPQCKEDAEEWTYPMRREMQE 

ILPGLFLGPYSSAMKSKLPVLQKHGITHIICIRQNIEANFIKPN 

FQQLFRYLVLDIADNPVENIIRFFPMTKEFIDGSLQMGGKVLVH 

GNAGISRSAAFVIAYIMETFGMKYRDAFAYVQERRFCINPNAGF 
VHGL0EYEAIYLAKLiTIOMM c ;PT.OTFT?QT.Q\rwcr"PTV-'CT von^uc 

EEDDFGTMQVATAQNG


6452 


1 


652 


RTRGESSNMEPLAAYPLKCSGPRAKVFAVLLSIVLCTVTLFLLQ
LKFLKPKINSFYAFEVKDAKGRTVSLEKYKGKVSLWNVASDCQ 
LTDRNYLGLKELHKEFGPSHFSVLAFPCNQFGESEPRPSKEVES 
FAR KN YG VT F P I FH KI K I LG S EGE PAFR PL VD <3 <5 TfTCP pdcjm i?u v 
YLVNPEGQWKFWRPEEPIEVIRPDIAALVRQVIIKKKEDL


6453 


827 


223 


HRRWLPGLSMSPRRTLPRPLSLCLSLCLCLCLAAALGSAQSGSC
RDKKNCKWFSQQELRKRLTPLQYHVTQEKGTESAFEGEYTHHK
DPGIYKCWCGTPLFKSETKFDSGSGWPSFHDVINSEAITFTDD
FSYGMHRVETSCSQCGAHLGHIFDDGPRPTGKRYCINSAALSFT
PADSSGTAEGGSGVASPAQADKAEL


6454 


827 


223 


HRRWLPGLSMSPRRTLPRPLSLCLSLCLCLCLAAALGSAQSGSC 
RDKKNCKWFSQQELRKRLTPLQYHVTQEKGTESAFEGEYTHHK
DPGIYKCWCGTPLFKSETKFDSGSGWPSFHDVINSEAITFTDD
FSYGMHRVETSCSQCGAHLGHIFDDGPRPTGKRYCINSAALSFT
PADSSGTAEGGSGVASPAQADKAEL


6455 


1042 


173 


RVHLATVSASAAWDALGLPVRSHMQGSTRRMGVMTDVHRRFLQL 
LMTHGVLEEWDVKRLQTHCYKVHDRNATVDKLEDFINNINSVLE 
SLYIEIKRGVTEDDGRPIYALVNLATTSISKMATDFAENELDLF 
RKALELIIDSETGFASSTNILNLVDQLKGKKMRKKEAEQVLQKF
VQNKWLIEKEGEFTLHGRAILEMEQYIRETYPDAVKICNICHSL
LIQGQSCETCGIRMHLPCVAKYFQSNAEPRCPHCNDYWPHEIPk' 
VFDPEKERESGVLKSNKKSLRSRQH 


6456 


2 


555 


RPQSRSISMWRNSLLQVSSGLRWLRVCAMVDILGERHLVTCKGA
TVEAEAALQNKWALYFAAARCAPSRDFTPLLCDFYTALVAEAR 
RPAPFEWFVSADGSSQEMLDFMRELHGAWLALPFHDPYRHELR 
KRYNVTAIPKLVIVKQNGEVITNKGRKQIRERGLACFQDWVEAA
DIFQNFSV 


6457 


23 


892 


PTTGFPVTNFPWNWPDGKPPIMILYVSKLNKIIHFPDFDKKIPV 
KLFPLPLLYVGNHISGLSSTSKLSLPMFTVLRKFTIPLTLLLET
IILGKQYSLNIILSVFAIILGAFIAAGSDLAFNLEGYIFVFLND
IFTAANGVYTKQKMDPKELGKYGVLFYNACFMIIPTLIISVSTG
DLQQATEFNQWKNWFILQFLLSCFLGFLLMYSTVLCSYYNSAL
TTAWGAIKNVSVAYIGILIGGDYIFSLLNFVGLNICMAGGLRY 
SFLTLSSQLKPKPVGEENICLDLKS 


6458 


23 


892 


PTTGFPVTNFPWNWPDGKPPIMILYVSKLNKIIHFPDFDKKIPV 
KLFPLPLLYVGNHISGLSSTSKLSLPMFTVLRKFTIPLTLLLET 
IILGKQYSLNIILSVFAIILGAFIAAGSDLAFNLEGYIFVFLND
IFTAANGVYTKQKMDPKELGKYGVLFYNACFMIIPTLIISVSTG
DLQQATEFNQWKNWFILQFLLSCFLGFLLMYSTVLCSYYNSAL
TTAWGAIKNVSVAYIGILIGGDYIFSLLNFVGLNICMAGGLRY
SFLTLSSQLKPKPVGEENICLDLKS 


6459 


23 


892 


PTTGFPVTNFPWNWPDGKPPIMILYVSKLNKIIHFPDFDKKIPV
KLFPLPLLYVGNHISGLSSTSKLSLPMFTVLRKFTIPLTLLLET
IILGKQYSLNIILSVFAIILGAFIAAGSDLAFNLEGYIFVFLND
IFTAANGVYTKQKMDPKELGKYGVLFYNACFMIIPTLIISVSTG
DLQQATEFNQWKNWFILQFLLSCFLGFLLMYSTVLCSYYNSAL
TTAWGAIKNVSVAYIGILIGGDYIFSLLNFVGLNICMAGGLRY
SFLTLSSQLKPKPVGEENICLDLKS


6460 


23 


892 


PTTGFPVTNFPWNWPDGKPPIMILYVSKLNKIIHFPDFDKKIPV 
KLFPLPLLYVGNHISGLSSTSKLSLPMFTVLRKFTIPLTLLLET
IILGKQYSLNIILSVFAIILGAFIAAGSDLAFNLEGYIFVFLND
IFTAANGVYTKQKMDPKELGKYGVLFYNACFMIIPTLIISVSTG
DLQQATEFNQWKNWFILQFLLSCFLGFLLMYSTVLCSYYNSAL
TTAWGAIKNVSVAYIGILIGGDYIFSLLNFVGLNICMAGGLRY
SFLTLSSQLKPKPVGEENICLDLKS 


6461 


1653 


360 


LQQRTLRITAVGQTHPIAWMAWEPSLGAFYGPASFITFVNCMYF
LSIFIQLKRHPERKYELKEPTEEQQRLAANENGEINHQDSMSLS
LISTSALENEHTFHSQLLGASLTLLLYVALWMFGALAVSLYYPL
DLVFSFVFGATSLSFSAFFWHHCVNREDVRLAWIMTCCPGRSS
YS VO VNVO P PNSNGTNGE AP KC PM ^ A P <; c; PTMK - C a. c c t?ttm<3 c r\ 
GCKLTNLQAAAAQCHANSLPLNSTPQLDNSLTEHSMDNDIKMHV
APLEVQFRTNVHSSRHHKNRSKGHRASRLTVLREYAYDVPTSVE
GSVQNGLPKSRLGNNEGHSRSRRAYLAYRERQYNPPQQDSSDAC
STLPKSSRNFEKPVSTTSKKDALRKPAWELENQQKSYGLNLAI 
QNGPIKSNGQEGPLLGTDSTGNVRTGLWKHETTV 


6462


3 


773 


SEELDREKKLKEDSPRKTPNKESGVPSLPVSLTSIKEEPKEAKH 
PDSQSMEESKLKNDDRKTPVNWKDSRGTRVAVSSPMSQHQSYIQ 
YLHAYPYPQMYDPSHPAYRAVSPVLMHSYPGAYLSPGFHYPVYG 
KMSGREETEKVNTSPSVNTKTTTESKALDLLQQHANQYRSKSPA
PVEKATAEREREAERERDRHSPFGQRHLHTHHHTHVGMGYPLIP 
GQYDPFQGLTSAALVASQQVAAQASASGMFPGQRRE 


6463 


2 


350 


VILCILGGWIFKNADRSMEKKKGEPRTRAEARPWVDEDLKDSSD
LHQAEEDADEWQESEENVEHIPFSHNHYPEKEMVKRSQEFYELL 
NKRRSVRFISNEQVPMEVIDNVIRTAGL 


6464 


12 


1154 


GILRQKEREERNRIHKKEILFLEHLLWPSEMSSLSGKVQTVLG
LVEPSKLGRTLTHBHLAMTFDCCYCPPPPCQEAISKEPIVMKNIi 
YWIQKNAYSHKENLQLNQETEAIKEELLYFKANGGGALVENTTT
GISRDTQTLKRLAEETGVHIISGAGFYVDATHSSETRAMSVEQL 
TDVLMNEILHGADGTSIKCGIIGEIGCSWPLTESERKVLQATAH
AQAQLGCPVIIHPGRSSRAPFQIIRILQEAGADISKTVMSHLDR 
T 1 LDKKELLE FAQLGC YLE YDLtf GTELLHYQLGPDI DMPDDNKR 
IRRVRLLVEEGCEDRILVAHDIHTKTRLMKYGGHGYSHILTNW
PKMLLRGITENVLDKILIENPKQWLTFK


6465 


126 


1396 


KMTVFFKTLRNHWKKTTAGLCLLTWGGHWLYGKHCDJTLLRRAAC " 

QEAQVFGNQLIPPNAQVKKATVFLNPAACKGKARTLFEKNAAPI

LHLSGMDVTIVKTDYEGQAKKLLELMENTDVIIVAGGDGTLQEV 

VTGVLRRTDEATFSKIPIGFIPLGETSSLSHTLFAESGNKVQHI

TDATLAIVKGETVPLDVLQIKGEKEQPVFAMTGLRWGSFRDAGV 

KVSKYWYLEPLKIKAAHFF3TLKEWPQTHQASISYTGPTERPPN 

EPEETPVQRPSLYRRILRRLASYWAQPQDALSQEVSPEVWKDVQ 

LSTIELSITTRNNQLDPTSKEDFLNICIEPDTISKGDFITIGSR 

KVRNPKLHVEGTECLQASQCTLLIPEGAGGSFSIDSEEYEAMPV 

EVKLLPRKLQFFCDPRKREQMLTSPTQ 


6466 


1134 


828 


VARGTELSQLEKAHPPADMGRRKSKRKPPPKKKMTGTLETQFTC 
PFCNHEKSCDVKMDRARNTGVISCTVCLEEFQTPITYLSEPVDV 
YSDWIDACEAANQ


6467 


301 


2571 


GELRVLALAHGELACHAVLTASLLSLRSRLMDSDMDYERPNVET 
IKCVWGDNAVGKTRLICARACNATLTQYQLLATHVPTVWAIDQ 
YRVCQEVLERSRDWDDVSVSLRLWDTFGDHHKDRRFAYGRSDV 
WLCFSIANPNSLHHVKTMWYPEIKHFCPRAPVILVGCQLDLRY
ADLEAVNRARRPLARPIKPNEILPPEKGREVAKELGIPYYETSV
VAQFGIKDVFDNAIRAALISRRHLQFWKSHLRNVQRPLLQAPFL
PPKPPPPIT WPDPPQ^^FPPPAHT.T.PriDT ranvTT \tt /"nc-dii-dt 

FAHKIYLSTSSSKFYDLFLMDLSEGELGGPSEPGGTHPEDHQGH 
SDQHHHHHHHHHGRDFLLRAASFDVCESVDEAGGSGPAGLRAST 
SDGILRGNGTGYLPGRGRVLSSWSRAFVSIQEEMAEDPLTYKSR 
LMVWKMDSSIQPGPFRAVLKYLYTGELDENERDLMHIAHIAEL 
LEVFDLRMMVANILNNEAFMNQEITKAFHVRRTNRVKECLAKGT
FSDVTFILDDGTISAHKPLLISSCDWMAAMPGGPFVESSTREW 
FPYTSKSCMRAVLEYLYTGMFTSSPDLDDMKLIILANRLCLPHL 
VALTEQYTVTGLMEATQMMVDIDGDVLVFLELAQFHCAYQLADW 
CLHHICTNYNNVCRKFPRDMKAMSPENQEYFEKHRWPPVWYLKE 
EDHYQRARKEREKEDYLHLKRQPKRRWLFWNSPSSPSSSAASSS 
SPSSSSAW 


6468


3 


1374 


DAWAGTNMAALAPVGSPASRGPRLAAGLRLLPMLGLLQLLAEPG 
LGRVHHLALKDDVRHKVHLNTFGFFKDGYMWNVSSLSLNEPED
KDVTIGFSLDRTKNDGFSSYLDEDVNYCILKKQSVSVTLLILDI
SRSEVRVKSPPEAGTQLPKIIFSRDEKVLGQSQEPNVNPASAGN
QTQKTQDGGKSKRSTVDSKAMGEKSFSVHNNGGAVSFQFFFNIS
TDDQEGLYSLYFHKCLGKELPSDKFTFSLDIEITEKNPDSYLSA 
GEIPLPKLYISMAFFFFLSGTIWIHILRKRRNDVFKIHWLMAAL 
PFTKSLSLVFHAIDYHYISSQGFPIEGWAWYYITHLLKGALLF 
ITIALIGTGWAFIKHILSDKDKKIFMIVIPRRVLANVAYIIIES 
TEEGTTEYGLWKDSLFLVDLLCCGAILFPWWSIRHLQEASATD 
GKGKFSRAHFVLLSLL 


6469 


3 


1374 


DAWAGTNMAALAPVGSPASRGPRLAAGLRLLPMLGLLQLLAEPG
LGRVHHLALKDDVRHKVHLNTFGFFKDGYMWNVSSLSLNEPED
KDVTIGFSLDRTKNDGFSSYLDEDVNYCILKKQSVSVTLLILDI 
SRSEVRVKSPPEAGTQLPKIIFSRDEKVLGQSQEPNVNPASAGN 
QTQKTQDGGKSKRSTVDSKAMGEKSFSVHNNGGAVSFQFFFNIS 
TDDQEGLYSLYFHKCLGKELPSDKFTFSLDIEITEKNPDSYLSA
GEIPLPKLYISMAFFFFLSGTIWIHILRKRRNDVFKIHWLMAAL
PFTKSLSLVFHAIDYHYISSQGFPIEGWAWYYITHLLKGALLF
ITIALIGTGWAFIKHILSDKDKKIFMIVIPRRVLANVAYIIIES
TEEGTTEYGLWKDSLFLVDLLCCGAILFPWWSIRHLQEASATD 
GKGKFSRAHFVLLSLL 


6470 


2726 


1437 


AAASGVSSRADAPVLAQSPASAGNGRPSTPRVPGSRRHPSAPRS
GPLPREDGCRTPGPQLLPLPGALLRPRTLLSSAAETGRSRHPDT 
QHPSSGGRCRGGTESPSSAAGRPASMAEAEEDCHSDTVRADDDE 
ENESPAETDLQAQLQMFRAQWMFELAPGVSSSNLENRPCRAARG
SLQKTSADTKGKQEQAKEEKARELFLKAVEEEQNGALYEAIKFY 
RRAMQLVPDIEFKITYTRSPDGDGVGNSYIEDNDDDSKMADLLS 
YFQQQLTFQESVLKLCQPELESSQIHISVLPMEVLMYIFRWWS 
SDLDLRSLEQLSLVCRGFYICARDPEIWRLACLKVWGRSCIKLV 
PYTSWREMFLERPRVRFDGVYISKTTYIRQGEQSLDGFYRAWHQ 
VEYYRYIRFFPDGHVMMLTTPEEPQSIVPRLRTR 


6471 


1750 


299 


fffdkmaaggsgvggkrssksdadsgflglrptsvdpalrrrrr 
gprnkkrgwrrlaqeplglevdqfledvrlqertsggllseapn 
eklffvdtgskekgltkkrtkvqkkslllkkplrvdlilentsk
vpapkdvlahqvpnakklrrkeqlweklakqgelprevrraqar 
llnpsatrakpgpqdtverpfydlwasdnpldrplvgqdeffle 
qtkkkgvkrparlhtkpsqapavevapagasynpsfedhqtlls 
aahevelqrqkeaeklerqlalpateqaatqestfqelceglle 
esdgegepgqgegpeagdaevcptparlattekkteqqrrreka
vhrlrvqqaalraarlrhqelfrlrgikaqvalrlaelarrqrr 
rqarreaeadkprrlgrlkyqapdidvqlsseltdslrtlkpeg
nilrdrfksfqrrnmieprerakfkrkykvklvekrafreiql 


6472 


3 


8^7 


SCGSDRAQWAMEFPFDVDALFPERITVLDQHLRPPARRPGTTTP 
ARVDLQQQIMTIIDELGKASAKAQNLSAPITSASRMQSNRHVVY 
ILKDSSARPAGKGAIIGFIKVGYKKLFVLDDREAHNEVEPLCIL 
DFYIHESVQRHGHGRELFQYMLQKERVEPHQLAIDRPSQKLLKF 
LNKHYNLETTVPQVNNFVIFEGFFAHQHRPPAPSLRATRHSRAA 
AVDPTPAAPARKLPPKRAEGDIKPYSSSDREFLKVAVEPPWPLN
RAPRRATPPAHPPPRSSSLGNSPERGPLRPFVP


6473 


22 


912 


SSAVEFVWEGEKMAAEPNKTEIQTLFKRLRAVPTNKACFDCGAK
NPSWASITYGVFLCIDCSGVHRSLGVHLSFIRSTELDSNWNWFQ
LRCMQVGGNANATAFFRQHGCTANDANTKYNSRAAQMYREKIRQ
LGSAALARHGTDLWIDNMSSAVPNHSPEKKDSDFFTEHTQPPAW
DAPATEPSGTQQPAPSTESSGLAQPEHGPNTDLLGTSPKASLEL
KSSIIGKKKPAAAKKGLGAKKGLGAQKVSSQSFSEIERQAQVAE
KLREQQAADAKKQAEESMVASMRLAYQELQIDR 


6474 


3 


462 


LQRQRQHPAAAPAVPVRCFTFCFTDIVIMPKRKSPENTEGKDGS 
KVTKQEPTRRSARLSAKPAPPKPEPKPRKTSAKKEPGAKISRGA 
KGKKEEKQEAGKEGTAPSENGETKAEEIHISRSTVNVSTSRGTP 
PSTLSVKGQIETVRVKGTEN 


6475 


3 


462 


LQRQRQHPAAAPAVPVRCFTFCFTDIVIMPKRKSPENTEGKDGS 
KVTKQEPTRRSARLSAKPAPPKPEPKPRKTSAKKEPGAKISRGA 
KGKKEEKQEAGKEGTAPSENGETKAEEIHISRSTVNVSTSRGTP 
PSTLSVKGQIETVRVKGTEN 


6476


106 


1090 


AJRAMAQ YKGTMREAGRAMHLLKKRERQREQMS VLKQR I AEET I L 
KSQVDKRFSAHYDAVEAELKSSTVGLVTLNDMKARQEALVRERE 
RQLAKRQHLEEQRLQQERQREQEQRRERKRKISCLSFALDDLDD 
QADAAEARRAGNLGKNPDVDTSFLPDRDREEEENRLREELRQEW 
EAQREKVKDEEMEVTFSYWDGSGHRRTVRVRKGNTVQQFLKKAL 
QGLRKDFLELRSAGVEQLMFIKEDLILPHYHTFYDFIIARARGK
SGPLFSFDVHDDVRLLSDATMEKDESHAGKWLRSWYEKNKHIF
PASRWEAYDPEKKWDKYTIR


6477 


227 


915 


LQGHLMGIMAASRPLSRFWEWGKNIVCVGRNYADHVREMRSAVL
SEPVLFLKPSTAYAPEGSPILMPAYTRNLHHELELGVVMGKRCR
AVP EAAAMD YVGGYALCliDMTARDVQDECKKKGLPWTLAKS FTA 
SCPVSAFVPKEKIPDPHKLKLWLKVNGELRQEGETSSMIFSIPY 
IISYVSKIITLEEGDIILTGTPKGVGPVKENDEIEAGIHGLVSM
TFKVEKPEY 


6478 


2


1495 


F VS S RI LPE S LAS S EAS TLEAMGRK K KD DC ^ <? w Y ktjttm t v? w 
IFMEVLGSGAFSEVFLVKQRLTGKLFALKCIKKSPAFRDSSLEN
EIAVLKKIKHENIVTLEDIYESTTHYYLVMQLVSGGELFDRILE
RGVYTEKDASLVIQQVLSAVKYLHENGIVHRDLKPENLLYLTPE 
ENSKIMITDFGLSKMEQNGIMSTACGTPGYVAPEVLAQKPYSKA 
VDCWSIGVITYILLCGYPPFYEETESKLFEKIKEGYYEFESPFW
DDISESAKDFICHLLEKDPNERYTCEKALSHPWIDGNTALHRDI
YPSVSLQIQKNFAKSKWRQAFNAAAVVHHMRKLHMNLHSPGVRP 
EVENRPPETOASETSRPSSPETTTTPAPVT.nM<5var diit tht dp 

QHGRRPTAPGGRSLNCLVNGSLHISSSLVPMHQGSr^AAGPCGCC 
SSCLNIGSKGKSSYCSEPTLLKKANKKQNFKSEVMVPVKASGSS 
HCRAGQTGVCLIM 


6479 


3 


949 


SCRGPGWHPAGGQAGAMELLSALSLGELALSFSRVPLFPVFDLS
YFIVSILYLKYEPGAVELSRRHPIASWLCAMLHCFGSYILADLL 
LGEPLIDYFSNNSSILLASAVWYLIFFCPLDLFYKCVCFLPVKL 
IFVAMKEWRVRKIAVGIHHAHHHYHHGWFVMIATGWVKGSGVA 
LMSNFEQLLRGVWKPETNEILHMSFPTKASLYGAILFTLQQTRW
LPVSKASLIFIFTLFMVSCKVFLTATHSHSSPFDALEGYICPVL

FGSACGGDHHHDNHGGSHSGGGPGAQHSAMPAKSKEELSEGSRK 
KKAKKAD 


6480 


192 


514 


DFMSIYFPIHCPDYLRSAKMTEVMMNTQPMEEIGLSPRKDGLSY 
QIFPDPSDFDRCCKLKDRLPSIWEPTEGEVESGELRWPPEEFL
VQEDEQDNCEETAKENKEQ 


6481 


110 


1131 


KSRMDLDWNMFVIAGGTLAIPILAFVASFLLWPSALIRIYYWY
WRRTLGMQVRYVHHEDYQFCYSFRGRPGHKPSILMLHGFSAHKD
MWLSWKFLPKNLHLVCVDMPGHEGTTRSSLDDLSIDGQVKRIH
QFVECLKLNKKPFHLVGTSMGGQVAGVYAAYYPSDVSSLWLVCP
AGLQYSTDNQFVQRLKELQGSAAVEKIPLIPSTPEEMSEMLQLC 
SYVRFKVPQQILQGLVDVRIPHNNFYRKLFLEIVSEKSRYSLHQ 
NMDKIKVPTQIIWGKQDQVLDVSGADMLAKSIANCQVELLENCG 
HSWMERPRKTAKLIIDFLASVHNTDNNKKLD


6482 


2517 


568 


E P VS KVS QS RRKAG V PTAN I E E S QAVE AAt4AN VP WAE VCE KFQA 
ALALSRVELHKNPEKEPYKSKYSARALLEEVKALLGPAPEDEDE 
RPEAEDGPGAGDHALGLPAEWEPEGPVAQRAVRLAVIEFHLGV
NHIDTEELSAGEEHLVKCr.RT.T.T?J?VPT.cjwnf*T<3T nnanMMT r*r 

LWSEREEIETAQAYLESSEALYNQYMKEVGSPPLDPTERFLPEE 
EKLTEQERSKRFEKVYTHNLYYLAQVYQHLEMFEKAAHYCHSTL 
KRQLEHNAYHPIEWAINAATLSQFYINKLCFMEARHCLSAANVI 
FGQTGKISATEDTPEAEGEVPELYHQRKGEIARCWIKYCLTLMQ
NAQLSMQDNIGELDLDKQSELRALRKKELDEEESIRKKAVQFGT 
GELCDAISAVEEKVSYLRPLDFEEARELFLLGQHYVFEAKEFFQ 
IDGYVTDHIEWQDHSALFKGLAFFETDMERRCKMHKRRIAMLE 
PLTVDLNPQYYLLVNRQIQFEIAHAYYDMMDLKVAIADRLRDPD 
SHIVKKINNLNKSALKYYQLFLDSLRDPNKVFPEHIGEDVLRPA
MLAKFRVARLYGKIITADPKKELENLATSLEHYKFIVDYCEKHP 
EAAQEIEVELELSKEMVSLLPTKMERFRTKMALT 


6483 


3 


623 


NSHLLCGLRARAP^ANGREARAMEQRLAEFRAARKRAGLAAQP 
PAASQGAQTPGEKAEAAATLKAAPGWLKRFLVWKPRPASARAQP
GLVQEAAQPQGSTSETPWNTAIPLPSCWDQSFLTNITFLKVLLW
LVLLGLFVELEFGLAYFVLSLFYWMYVGTRGPEEKKEGEKSAYS
VFNPGCEAIQGTLTAEQLERELQLRPLAGR


6484 


201 


965 


QLAVKTKMSGLRPGTQVDPEIELFVKAGSDGESIGNCPFCQRLF 
MILWLKGVKFNVTTVDMTRKPEELKDLAPGTNPPFLVYNKELKT 
DFIKIEEFLEQTLAPPRYPHLSPKYKESFDVGCNLFAKFSAYIK
NTQKEANKNFEKSLLKEFKRLDDYLNTPLLDEIDPDSAEEPPVS 
RRLFLDGDQLTLADCSLLPKLNIIKVAAKKYRDFDIPAEFSGVW
RYLHNAYAREEFTHTCPEDKEIENTYANVAKQKS 


6485 


6 


1091 


FVDLVRAVEFLPCPDSQKLEKECQSSEESMGSNSMRSILEEDEE 
DEEPPRVLLYHEPRSFEVGMLVWHKHKKYPFWPAWKSVRQRDK 
KASVLYIEGHMNPKMKGFTVSLKSLKHFDCKEKQTLLNQAREDF 
NQDIGWCVSLITDYRVRLGCGSFAGSFLBYYAADISYPVRKSIQ 
QDVLGTKLPQLSKGSPEEPWGCPLGQRQPCRKMLPDRSRAARD 
RANQKLVEYIGKAKGAESHLRAILKSRKPSRWLQTFLSSSQYVT 
CVETYLEDEGQLDLVVKYLQGVYQEVGAKVLQRTNGDRIRFILD 
VLLPEAIICAISAGDEVDYKTAEEKYIKGPSLSYREKEIFDNQL
LEERNRRRR 


6486 


10 


581 


LVLQAGGAHLSPSRVTQGIYYMLAFSEMPKPPDYSELSDSLTLA 
GGTGRFSGPLHRAWRMMNFRQRMGWIGVGLYLLASAAAFYYVFE 
ISETYNRLALEHIQQHPEEPLEGTTWTHSLKAQLLSLPFWVWTV 
IFLVPYLQMFLFLYSCTRADPKTVGYCIIPICLAVICNRHQAFV 
KASNQISRLQLIDT


6487 


352 


863 


SFLKPLRGKMSVTLHTDVGDIKIEVFCERTPKTCENFLALCASN 
YYNGCIFHRNIKGFMVQTGDPTGTGRGGNSIWGKKFEDEYSEYL 
KHNVRGWSMANNGPNTNGSQFFITYGKQPHLDMKYTVFGKVID
GLETLDELEKLPVNEKTYRPLNDVHIKDITIHANPFAQ


6488 


878 


241 


TALQEFGTSGPPLSLRFALPSGTGRFKPLPGARGPSWPPSPRVP 
MEPPNLYPVKLYVYDLSKGLARRLSPIMLGKQLEGIWHTSIWH
KDEFFFGSGGISSCPPGGTLLGPPDSWDVGSTEVTEEIFLEYL 
SSLGESLFRGEAYNLFEHNCNTFSNEVAQFLTGRKIPSYITDLP 
SEVLSTPFGQALRPLLDSIQIQPPGGSSVGRPNGQS


6489 


1457 


375 


KVAKMATALSEEELDNEDYYSLLNVRREASSEELKAAYRRLCML 
YHPDKHRDPELKSQAERLFNLVHQAYEVLSDPQTRAIYDIYGKR 
GLEMEGWEWERRRTPAEIREEFERLQREREERRLQQRTNPKGT 
ISVGVDATDLFDRYDEEYEDVSGSSFPQIEINKMHISQSIEAPL 
TATDTAILSGSLSTQNGNGGGSINFALRRVTSAKGWGELEFGAG 
DLQGPLFGLKLFRNLTPRCFVTTNCALQFSSRGIRPGLTTVLAR 
NLDKNTVGYLQWHCSSPLLQVQRPHRNTRACAPEPSFRPFLHVP 
TWDAECSGARTPSTAWTSAAVKLREACLSGPGSGSHQLLLLTPR 
SKRRTGGG 


6490 


3 


1183 


HEAGCEVWLGYGPRAAAAAAATVLFGGAGPTETMFVARSIAADH
KDLIHDVSFDFHGRRMATCSSDQSVKVWDKSESGDWHCTASWKT
HSGSVWRVTWAHPEFGQVLASCSFDRTAAVWEEIVGESNDKLRG
QSHWVKRTTLVDSRTSVTDVKFAPKHMGLMLATCSADGIVRIYE
APDVMNLSQWSLQHEISCKLSCSCISWNPSSSRAHSPMIAVGSD 
DSSPNAMAKVQIFEYNENTRKYAKAETLMTVTDPVHDIAFAPNL 
GRSFHILAIATKDVRIFTLKPVRKELTSSGGPTKFEIHIVAQFD 
NHNSQVWRVSWNITGTVLASSGDDGCVRLWKANYMDNWKCTGIL
KGNGSPVNGSSQQGTSNPSLGSNIPSLQNSLNGSSAGRKHS 


6491 


3 


1183 


HEAGCEVWLGYGPRAAAAAAATVLFGGAGPTETMFVARSIAADH
KDLIHDVSFDFHGRRMATCSSDQSVKVWDKSESGDWHCTASWKT
HSGSVWRVTWAHPEFGQVLASCSFDRTAAVWEEIVGESNDKLRG
QSHWVKRTTLVDSRTSVTDVKFAPKHMGLMLATCSADGIVRIYE
APDVMNLSQWSLQHEISCKLSCSCISWNPSSSRAHSPMIAVGSD 
DSSPNAMAKVQIFEYNENTRKYAKAETLMTVTDPVHDIAFAPNL 
GRSFHILAIATKDVRIFTLKPVRKELTSSGGPTKFEIHIVAQFD 
NHNSQVWRVSWNITGTVLASSGDDGCVRLWKANYMDNWKCTGIL



WO 01/53312

PCT/US00/34263

SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








KGNGSPVNGSSQQGTSNPSLGSNIPSLQNSLNGSSAGRKHS


6492 


34 


2573 


IPFLKSCCCCCLFDFPPPPLDQVQEEECEVERVTEHGTPKPFRK

FDSVAFGESQSEDEQFENDLETDPPNWQQLVSREVLLGLKPCEI 

KRQEVINELFYTERAHVRTLKVLDQVFYQRVSREGILSPSELRK 

IFSNLEDILQLHIGLNEQMKAVRKRNETSVIDQIGEDLLTWFSG

PGEEKLKHAAATFCSNQPFALEMIKSRQKKDSRFQTFVQDAESN 

PLCRRLQLKDIIPTQMQRLTKYPLLLDNIATYTEWPTEREKVKK

AADHCRQILNYVNQAVKEAENKQRLEDYQRRLDTSSLKLSEYPN 

VEELRNLDLTKRKMIHEGPLVWKVNRDKTIDLYTLLLEDILVLL 

QKQDDRLVLRCHSKILASTADSKHTFSPVIKLSTVLVRQVATDN 

KALFVI SMSDNGAQI YELVAQTVS EKTVWQDLI CRMT^AS VKEQS 

TKPIPLPQSTPGEGDNDEEDPSKLKEEQHGISVTGLQSPDRDLG 

LESTLISSKPQSHSLSTSGKSEVRDLFVAERQFAKSQHTDGTLK 

EVGEDYQIAIPDSHLPVSEERWALDALRNLGLLKQLLVQQLGLT 

EKSVQEDWQHFPRYRTASQGPQTDSVIQNSENIKAYHSGEGHMP 

FRTGTGDIATCYSPRTSTESFAPRDSVGLAPQDSQASNILVMDH 

MIMTPEMPTMEPEGGLDDSGEHFFDAREAHSDENPSEGDGAVNK 

EEKDVNLRISGNYLILDGYDPVQESSTDEEVASSLTLQPMTGIP

AVESTHQQQHSPQNTHSDGAISPFTPEFLVQQRWGAMEYSCFEI

QSPSSCADSQSQIMEYIHKIEADLEHLKKVEESYTILCQRLAGS 
ALTDKHSDKS 


6493 


557 


1147 


TPARMAYQGSSTSDCMSKTLDSASAHFAASAWSAPVPSRSEVA
KEQNTGHNNINGWQPSGTSKTLYSTNMALSSSPGISAVQLVRT
VGHTTTNHLIPALCTSSPQTLPMNNSCLTNAVHLNNVSWSPVN
VHINTRTSAPSPTALKLATVAASMDRVPKVTPSSAISSIARENH 
EPERLGLNGIAETTVAMEVT 


6494 


2425 


1052 


AVAGGARPCSTPSSPHRRCRRHRPRPLPRPPAAIMSASAVYVLD 
LKGKVLICRNYRGDVDMSEVEHFMPILMEKEEEGMLSPILAHGG 
VRFMWIKHNNLYLVATSKKNACVSLVFSFLYKWQVFSEYFKEL
EEESIRDNFVIIYELLDELMDFGYPQTTDSKILQEYITQEGHKL
ETGAPRPPATVTNAVSWRSEGIKYRKNEVFLDVIESVNLLVSAN
GNVLRSEIVGSIKMRVFLSGMPELRLGLNDKVLFDNTGRGKSKS
VELEDVKFHQCVRLSRFENDRTISFIPPDGEFELMSYRLNTHVK
PLIWIESVIEKHSHSRIEYMIKAKSQFKRRSTANNVEIHIPVPN
DADSPKFKTTVGSVKWVPENSEIVWSIKSFPGGKEYLMRAHFGL
PSVEAEDKEGKPPISVKFEIPYFTTSGIQVRYLKIIEKSGYQAL
PWVRYITQNGDYQLRTQ


6495 


2425 


1052 


AVAGGARPCSTPSSPHRRCRRHRPRPLPRPPAAIMSASAVYVLD
LKGKVLICRNYRGDVDMSEVEHFMPILMEKEEEGMLSPILAHGG
VRFMWIKHNNLYLVATSKKNACVSLVFSFLYKWQVFSEYFKEL
EEESIRDNFVIIYELLDELMDFGYPQTTDSKILQEYITQEGHKL
ETGAPRPPATVTNAVSWRSEGIKYRKNEVFLDVIESVNLLVSAN
GNVLRSEIVGSIKMRVFLSGMPELRLGLNDKVLFDNTGRGKSKS
VELEDVKFHQCVRLSRFENDRTISFIPPDGEFELMSYRLNTHVK
PLIWIESVIEKHSHSRIEYMIKAKSQFKRRSTANNVEIHIPVPN
DADSPKFKTTVGSVKWVPENSEIVWSIKSFPGGKEYLMRAHFGL
PSVEAEDKEGKPPISVKFEIPYFTTSGIQVRYLKIIEKSGYQAL
PWVRYITQNGDYQLRTQ


6496 


247 


559 


LRAVSLLPLQLVLPEYSIHSLFCIMFLCAQEWLTLGLNVPLLFY 
HFWRYFHCPADSSELAYDPPWMNADTLSYCQKEAWCKliAFYLL 
S FFYYh YCM I YTL VS S 


6497 


1053 


352 


ANTQICRLCPRRHLHPPCGAKKGNGTEEDYNFVFKWLIGESGV
GKTNLLSRFTRNEFSHDSRTTIGVEFSTRTVMLGTAAVKAQIWD 
TAGLERYRAITSAYYRGAVGALLVFDLTKHQTYAWERWLKELY 
DHAEATIWMLVGNKSDLSQAREVPTEEARMFAENNGLLFLETS
ALDSTNVELAFETVLKEIFAKVSKQRQNSIRTNAITLGSAQAGQ
EPGPGEKRACCISL


6498 


2636 


272 


SLRLCPWGTHLAGPTTMRLSSLLALLRPALPLILGLSLGCSLSL 
LRVSWIQGEGEDPCVEAVGERGGPQNPDSRARLDQSDEDFKPRI 
VPYYRDPNKPYKKVLRTRYIQTELGSRERLLVAVLTSRATLSTL 
AVAVNRTVAHHFPRLLYFTGQRGARAPAGMQWSHGDERPAWLM 
SETLRHLHTHFGADYDWFFIMQDDTYVQAPRLAALAGHbSINQD 
LYLGRAEEFIGAGEQARYCHGGFGYLLSRSLLLRLRPHLDGCRG 
D I LSARPDEWLGRCLI DS LG VGCVSQHQGQQ YRSFELAKNRDPB 
KEGSSAFLSAFAVHPVSEGTLMYRLHKRFSALELERAYSEIEQL
QAQIRNLTVLTPEGEAGIiSWPVGLPAPFTPHSRFEVLGWDYFTE 
QHTFSCADGAPKCPLQGASRADVGDALETALEQLNRRYQPRLRF 
QKQRLLNGYRRFDPARGMEYTLDLLLECVTQRGHRRALARRVSL 
LRPLSRVEILPMPYVTEATRVQLVLPLLVAEAAAAPAFLEAFAA
NVLEPREHALLTLLLVYGPREGGRGAPDPFLGVKAAAAELERRY 
PGTRLAWLAVRAEAPSQVRLMDWSKKHPVDTLFFLTTVWTRPG
PEVLNRCRMNAISGWQAFFPVHFQEFNPALSPQRSPPGPPGAGP 
DPPSPPGADPSRGAPIGGRFDRQASAJ2GCFYNADYLAARARLAG 
ELAGQEEEEALEGLEVMDVFLRFSGLHLFRAVEPGLVQKFSLRD 
CSPRLSEELYHRCRLSNLEGLGGRAQLAMALFEQEQANST 


6499 


3 


2040 


SCSADTRPSGQAWPTVGLRAAAGAFRTGSPLALGPETPQVACLP
GHPPVRPQVSGGPGAMPDPAAHLPFFYGSISRAEAEEHLKLAGM 
ADGLFLLRQCLRSLGGYVLSLVHDVRFHHFPIERQLNGTYAIAG 
GKAHCGPAELCEFYSRDPDGLPCNLRKPCNRPSGLEPQPGVFDC 
LRDAMVRDYVRQTWKLEGEALEQAIISQAPQVEKLIATTAHERM
PWYHSSLTREEAERKLYSGAQTDGKFLLRPRKEQGTYALSLIYG 
KTVYHYLISQDKAGKYCIPEGTKFDTLWQLVEYLKLKADGLIYC
LKEACPNSSASNASGAAAPTLPAHPSTLTHPQRRIDTLNSDGYT
PEPARITSPDKPRPMPMDTSVYESPYSDPEELKDKKLFLKRDNL 
LI AD I ELGCGNFGSVRQGVYRMRKKQIDVAI KVLKQGTB KADTE 
EMMREAQIMHQLDNPYIVRLIGVCQAEALMLVMEMAGGGPLHKF
LVGKREEIPVSNVAELLHQVSMGMKYLEEKNFVHRDLAARWVLL 
VNRHYAKISDFGLSKALGADDSYYTARSAGKWPjLKWYAPECINF 
RKFSSRSDVWSYGVTMWEALSYGQKPYKKMKGPEVMAFIEQGKR
MECPPECPPELYALMSDCWIYKWEDRPDFLTVEQRMRACYYSLA 
SKVEGPPGSTQKAEAACA


6500 


1773 


72£ 


TGPTHASADAWGLVRSVTEWCANVRGNPCAAALSCPQAVLDAGK 
MLSESSSFLKGVMLGSIFCALITMLGHIRIGHGNRMHHHEHHHL
QAPNKEDILKISEDERMELSKSFRVYCIILVKPKDVSLWAAVKE
TWTKHCDKAEFFSSENVKVFESINMDTNDMWLMMRKAYKYAFDK 
YRDQYNWFFLARPTTFAIIENLKYFLLKKDPSQPFYLGHTIKSG
DLEYVGMEGGIVLSVESMKRLNSLIiNIPEKCPEQGGMIWKISED 
KQLAVCLKYAGVFAENAEDADGKDVFNTKSVGLSIKEAMTYHPN
QWEGCCSDMAVTFNGLTPNQMHVMMYGVYRLRAFGPYFQ


6501


1 


570 


LVGMSGGGTETPVGCEAAPGGGSKKRDSLGTAGSAHLIIKDLGE 
IHSRLLDHRPVIQGETRYFVKEFEEKRGLREMRVLENLKNMIHE 
TNEHTLPKCRDTMRDSLSQVLQRLQA/uNfDSVCRLQQREQERKKI 
HSDHLVASEKQHMLQWDNFMKEQPNKRAEVDEEHRKAMERLKEQ 
YAEMEKDLAKFSTF


6502 


213 


1650 


AGNKPDPWAGRNRTAVLPDVSVFHREDVGWWRSWLQQSYQAVKE

KSSEALEFMKRDLTEFTQWQHDTACTIAATASWKEKLATEGS 

SGATEKMKKGLSDFLGVISDTFAPSPDKTIDCDVITLMGTPSGT 

AEPYDGTKARLYSLQSDPATYCNEPDGPPELFDAWLSQFCLEEK 

KGEISELLVGSPSIRALYTKMVPAAVSHSEFWHRYFYKVHQLEQ

EQARRDALKQRAEQSISEEPGWEEEEEELMGISPISPKEAKVPV 

AKISTFPEGEPGPQSPCEENLVTSVEPPAEVTPSESSESISLVT 

QIANPATAPEARVLPKDLSQKLLEASLEEQGLAVDVGETGPSPP
IHSKPLTPAGHTGGPEPRPPARVETLREEAPTDLRVFELNSDSG
KSTPSNNGKKGSSTDISEDWEKDFDLDMTEEEVQMALSKVDASG 
EVSGPGGSEGSEPNGPGCESSPQPAQLSPQEGPCSCLR 


6503 


213 


1650 


AGNKPDPWAGRNRTAVLPDVSVFHREDVGWWRSWLQQSYQAVKE 
KSSEALEFMKRDLTEPTQWQHDTACTIAATASWKEKLATEGS
SGATEKMKKGLSDFLGVISDTFAPSPDKTIDCDVITLMGTPSGT
AEPYDGTKARLYSLQSDPATYCNEPDGPPELFDAWLSQFCLEEK 
KGEISELLVGSPSIRALYTKMVPAAVSHSEFWHRYFYKVHQLEQ 
EQARRDALKQRAEQSISEEPGWEEEEEELMGISPISPKEAKVPV
AKISTFPEGEPGPQSPCEENLVTSVEPPAEVTPSESSESISLVT 
QIANPATAPEARVLPKDLSQKLLEASLEEQGLAVDVGETGPSPP 
IHSKPLTPAGHTGGPEPRPPARVETLREEAPTDLRVFELNSDSG 
KSTPSNNGKKGSSTDISEDWEKDFDLDMTEEEVQMALSKVDASG
EVSGPGGSEGSEPNGPGCESSPQPAQLSPQEGPCSCLR 


6504 


2131 


1294 


GKVCLVAHWVCLSILSPPPAGMKTPNAQEAEGQQTRAAAGRATG 
SANMTKKKVSQKKQRGRPSSQPCRNIVGCRISHGWKEGDEPITQ 
WKGTVLDQVPINPSLYLVKYDGIDCVYGLELHRDERVLSLKILS 
DRVASSHISDANLANTIIGKAVEHMFEGEHGSKDEWRGMVLAQA
PIMKAWFYITYEKDPVLYMYQLLDDYKEGDLRIMPESSESPPTE 
REPGGWDGLIGKHVEYTKEDGSKRIGMVIHQVEAKPSVYFIKF 
DDDFHIYVYDLVKKS


6505


2131 


1294 


GKVCLVAHWVCLSILSPPPAGMKTPNAQEAEGQQTRAAAGRATG

SANMTKKKVSQKKQRGRPSSQPCRWIVGCKISHGWKEGDEPITQ 

WKGTVLDQVPINPSLYLVKYDGIDCVYGLELHRDERVLSLKILS 

DRVASSHISDANLANTIIGKAVEHMFEGEHGSKDEWRGMVLAQA

PIMKAWFYITYEKDPVLYMYQLLDDYKEGDLRIMPESSESPPTE

REPGGWDGLIGKHVEYTKEDGSKRIGMVIHQVEAKPSVYFIKF 

DDDFHIYVYDLVKKS


6506


1 


1350 


EVSPPTSCCLTVAVADPGVSEGFRGFGAGCEMPGRGRCPDCGST 
ELVEDSHYSQSQLVCSDCGCWTEGVLTTTFSDEGNLREVTYSR 
STGENEQVSRSQQRGLRRVRDLCRVLQLPPTFEDTAVAYYQQAY 
RHSGIRAARLQKKEVLVGCCVLITCRQHNWPLTMGAICTLLYAD
LDVFSSTYMQIVKLLGLDVPSLCLAELVKTYCSSFKLFQASPSV
PAKYVEDKEKMLSRTMQLVELANETWLVTGRHPLPVITAATFLA 
WQSLQPADRLSCSLARFCKLANVDLPYPASSRLQELLAVLLRMA 
EQLAWLRVLRLDKRSWKHIGDLLQHRQSLVRSAFRDGTAEVET 
REKEPPGWGQGQGEG5VGNNSLGLPQGKRPASPALLLPPCMLKS 
PKRICPVPPVSTVTGDENISDSEIEQYLRTPQEVRDFQRAQAAR 
QAATSVPNPP 


6507 


1878 


929 


RSHASRLPELPSGCLVLQVQELVQMSGMEATVTIPIWQNKPHGA
ARSVVRRIGTOLPLKPOUU^FETLPNISDLCIJ^VPPVPTIiAD 
IAWIAADEEETYARVRSDTRPLRHTWKPSPLIVMQRNASVPNLR 
GSEERLLALKKPALPALSRTTELQDELSHLRSQIAKIVAADAAS 
ASLTPDFLSPGSSNVSSPLPCFGSSFHSTTSFVISDITEETEVE 
VPELPSVPLLCSASPECCKPEHKAACSSSEEDDCVSLSKASSFA 
DMMGILKDFHRMKQSQDLNRSLLKEEDPAVLISEVLRRKFALKE 
EDISRKGN 1 


6508 


862 


342 


WEARKRPQRWPSERREVRVPPPHLQRGRSGLEPGTFRKMAAARP 
SLGRVLPGSSVLFLCDMQEKFRHNIAYFPQIVSVAARMLKNTTL 
DLLDRGLQVHVWDACSSRSQVDRLVALARMRQSGAFLSTSEGL
ILQLVGDAVHPQFKEIQKLIKEPAPDSGLLGLFQGQNSLLH 


6509 


2 


1053 


FVWNPRGGRKRRRQAAVTQAATRASGTPSPRDGTMTQGKLSVAN
KAPGTEGQQQVHGEKKEAPAVPSAPPSYEEATSGEGMKAGAFPP
APTAVPLHPSWAYVDPSSSSSYDNGFPTGDHELFTTFSWDDQKV
RRVFVRKVYTILLIQLLVTLAWALFTFCDPVKDYVQANPGWYW 
ASYAVFFATYLTLACCSGPRRHFPIWLILLTVFTLSMAYLTGML
SS YYNTTS VLLCLG I TALVCLS VTVFS FQT KFDFTS CQGVLF VIj 
LMTLFFSGLILAILLPFQYVPWLHAVYAALGAGVFTLFLALDTQ 
LLMGNRRHSLSPEEYIFGAIiNIYLDIIYIFTFFLQLFGTNRE 


6510 


37 


1156 


PCALIDGCPQRGAVHPLLSSAMGLLAFLKTQFVLHLLVGFVFVVS

GLVINFVQLCTLALWPVSKQLYRRLNCRLAYSLWSQLVMLLEWW 

SCTECTLFTDQATVERFGKEHAVIILNHNFEIDFLCGWTMCERF 

GVLGSSKVLAKKELLYVPLIGWTWYFLEIVFCKRKWEEDRDTW

EGLRRLSDYPEYMWFLLYCEGTRFTETKHRVSMEVAAAKGLPVL 

KYHLLPRTKGFTTAVKCLRGTVAAVYDVTLNFRGNKNPSLLGIL 

YGKKYEADMCVRRFPLEDIPLDEKEAAQWLHKLYQEKDALQEIY 

NQKGMFPGEQFKPARRPWTLLNFLSWATILLSPIiFSFVLGVFAS 

GSPLLILTFLGFVGAGNGHCR


6511 


2541 


1425 


GEEQPLAAAPTECLEQVIGGAGDPGTWASFPSPLPGPAPUCGGK 
TMATNFSDIVKQGYVKMKSRKLGIYRRCWLVFRKSSSKGPQRLE 
KYPDEKSVCLRGCPKVTEISNVKCVTRLPKETKRQAVAIIFTDD
SARTFTCDSELEAEEWYKTLSVECLGSRLNDISLGEPDLLAPGV
QCEQTDRFNVFLLPCPNLDVYGECKLQITHENIYLWDIHNPRVK 
LVSWPLCSLRRYGRDATRFTFEAGRMCDAGEGLYTFQTQEGEQI 
YQRVHSATLAIAEQHKRVLLEMEKNVRLLNKGTEHYSYPCTPTT 
MLPRSAYWHHITGSQNIAEASSYAGEGYGAAQASSETDLLNRFI
LLKPKPSQGDSSEAKTPSQ 


6512


159 


807 


FGKKSTWFPLSRSLRVASGRSCKLGHGGYTGSGPGFGEPRDSGA 
EVPSGSGRATGCERGGVRGARQGRAPGSSIWRKEPRMVCTRKTK
TLVSTCVIl»SGMTNI I CLLYVGWVTNYIASVYVRGQEPAPDKKL 
EEDKGDTLKIIERLDHLENVIKQHIQEAPAKPEEAEAEPFTDSS 
LFAHWGQELSPEGRRVALKQFQYYGYNAYLSDRLPLDRP 


6513 


2 


756 


FVS PE PGFS LAQLNL I WQ LTDTKQ L VHS FAEGQDQGSA YANRTA 
L FP D LliAQGN AS LRLQRVR VAD EG S FTC FVS IRD FGS AA VS LQ V 
AAPYSKPSMTLEPNKDLRPGDTVTITCSSYQGYPEAEVFWQDGQ 
GVPLTGNVTTSQMANEQGLFDVHSILRWLGANGTYSCLVRNPV 
LQQDAHSSVTITPQRSPTGAVEVQVPEDPWALVGTDATLRCSF
SPEPGFSLAQLNLIWQLTDTKQLVHSFAEGQDQGSAYANRTALF
PDLLAQGNASLRLQRVRVADEGSFTCFVSIRDFGSAAVSLQVAA 
PYSKPSMTLEPNKDLRPGDTVTITCSSYQGYPEAEVFWQDGQGV 
PLTGNVTTSQMANEQGLFDVHSILRWLGANGTYSCLVRNPVLQ 
QDAHSSVTITPQRSPTGAVEVQVPEDPWALVGTDATLRCSFSP
EPGFSLAQLNLIWQLTDTRQLVHSFTEGR 


6514 


985 


302 


VGIPGPTISSAAEMEDLLDLDEELRYSLATSRAKMGRRAQQESA
QAENHLNGKNSSLTLTGETSSAKLPRCRQGGWAGDSVKASKFRR
KASEEIEDFRLRPQSLNGSDYGGDIPIIPDLEEVQEEDFVLQVA
APPSIQIKRVMTYRDLDNDLMKYSAIQTLDGEIDLKLLTKVLAP
EHEVRERNPSWQDDVGWDWDHLFTEVSSEVLTEWDPLQTEKEDP
AGQARHT 


6515 


1345 


305 


GRVGSRRRGAAVPGGCGAGS TQLEVSAS ASCGALGS ADMNP I Vv 
vn\j^jyjt\\3f l anJJKK.h.K.VHy(jM VKAA 1 Vb Y GILREGGSAVDAVEG 
AWALEDDPEFNAGCGSVLNTNGEVEMDASIMDGKDLSAGAVSA
VQCIANPIKLARLVMEKTPHCFLTDQGAAQFAAAMGVPEIPGEK 
LVTERNKKRLEKEKHEKGAQKTDCQKNLGTVGAVALDCKGNVAY 
ATSTGG I VNKMVGRVGDS PCLGAGG YADNDIGAVSTTGHGES I h 
KVNLARLTLFHIEQGKTVEEAADLSLGYMKSRVKGLGGLIWSK
TGDWVAKWTSTSMPWAAAKDGKLHFGIDPDDTTITDLP


6516 


1 


1402 


FRRLRYLGQDATAAARDLRTRGLQGYCPSATARQQVLVSALQQL 
KGRRSEHRNENQEMPYSTNKELILGIMVGTAGISLLLLWYHKVR 
KPGIAMKLPEFLSLGNTFNSITLQDEIHDDQGTTVIFQERQLQI 
LEKLNELLTNMEELKEEIRFLKEAIPKLEEYIQDELGGKITVHK 
ISPQHRARKRRLPTIQSSATSNSSEEAESEGGYITANTDTEEQS
FPVPKAFNTRVEELNLDVLLQKVDHLRMSESGKSESFELLRDHK 
EKFRDEIEFMWRFARAYGDMYELSTNTQEKKHYANIGKTLSERA 
INRAPMNGHCHLWYAVLCGYVSEFEGLQNKINYGHLFKEHLDIA 
IKLLPEEPFLYYLKGRYCYTVSKLSWIEKKMAATLFGKIPSSTV 
QEALHNFLKAEELCPGYSNPNYMYLAKCYTDLEENQNALKFCNL 
ALLLPTVTKEDKEAQKEMQKIMTSLKR


6517 


3 


1414 


GRVWGGSSSLNAMVYVRGHAEDYERWQRQGARGWDYAHCLPYFR
KAQGHELGASRYRGADGPLRVSRGKTNHPLHCAFLEATQQAGYP
LTEDMNGFQQEGFGWMDMTIHEGKRWSAACAYLHPALSRTNLKA
EAETLVSRVr.FEGTRAVGVEYVKNGQSHRAYASKEVILSGGAIN
SPQLLMLSGIGNADDLKKLGIPWCHLPGVGQNLQDHLEIYIQQ
ACTRPITLHSAQKPLRKVCIGLEWLWKFTGEGATAHLETGGFIR
byPO»VPHPDIQFHFLPSQVIDHGRVPTQQEAYQVHVGPMRGTSV
GWLKLRSANPQDHPVIQPNYLSTETDIEDFRLCVKLTREIFAQE
ALAPFRGKELQPGSHIQSDKEIDAFVRAKADSAYHPSCTCKMGQ
PSDPTAWDPQTRVLGVENLRWDASIMPSMVSGNLNAPTIMIA
EKAADIIKGQPALWDKDVPVYKPRTLATQR


6518


242 


1098 


PAWNPGSEPRTRVRPRARSFPLPPPRAPRRRRHRLLRAVPGPSR
RHRCRRRAPPPPSTMGDAGSBRSKAPSLPPRCPCGFWGSSKTMN
LCSKCFADFQKKQPDDDSAPSTSNSQSDLFSEETTSDNNNTSIT
TPTLSPSQQPLPTELNVTSPSKEECGPCTDTAHVSLITPTKRSC
GTDSQSENEASPVKRPRLLENTERSEETSRSKQKSRRRCFQCQT
KLELVQQELGSCRCGYVFCMLHRLPEQHDCTFDHMGRGREEAIM
KMVKLDRKVGRSCQRIGEGCS


6519 


3 


1113 


ERKMAEPPSPVHCVAAAAPTATVSEKEPFGKLQLSSRDPPGSLS
AKKVRTEEKKAPRRVNGEGGSGGNSRQLQPPAAPSPQSYGSPAS
WSFAPLSAAPSPSSSRSSFSFSAGTAVPSSASASLSQPGPRKLL
VPPTLLHAQPHHLLLPAAAAAASANAKSRRPKEKREKERRRHGL
GGAREAGGASREENGEVKPLPRDKIKDKIKERDKEKEREKKKHK
VMNEIKKENGEVKILLKSGKEKPKTNIEDLQIKKVFCKKKKKKHK
ENEKRKRPKMYSKSIQTICSGLLTDVEDQAAKGILNDNIKDYVG
KNLDTKNYDSKIPENSEFPFVSLKEPRVQNNLKRLDTLEFKQLI
HIEHQPNGGASVIHCLQ


6520 


3 


1113 


ERKMAEPPSPVHCVAAAAPTATVSEKEPFGKLQLSSRDPPGSLS 
AKKVRTEEKKAPRRVNGEGGSGGNSRQLQPPAAPSPQSYGSPAS 
WSFAPLSAAPSPSSSRSSFSFSAGTAVPSSASASLSQPGPRKLL 
VPPTLLHAQPHHLLLPAAAAAASANAKSRRPKEKREKERRRHGL
GGAREAGGASREENGEVKPLPRDKIKDKIKERDKEKEREKKKHK
VMNEIKKENGEVKILLKSGKEKPKTNIEDLQIKKVKKKKKKKHK
ENEKRKRPKMYSKSIQTICSGLLTDVEDQAAKGILNDNIKDYVG
KNLDTKNYDSKIPENSEFPFVSLKEPRVQNNLKRLDTLEFKQLI
HIEHQPNGGASVIHCLQ


6521 
6522 


184 

1042 


1798 
391 


KLFKMATDTSQGELVHPKALPLrVGAQLIHADKLGEKVEDSTMP 
IRRTVNSTRETPPKSKLAEGEEEKPEPDISSEESVSTVEEQENE 
TPPATSSEAEQPKGEPENEEKEENKSSPFTK'lfni7K"nncwr?irT7Trir 

VKKTIPSWATLSASQLARAQKQTPMASSPRPKMDAILTEAIKAC 
FQKSGASWAIRKYIIHKYPSLELERRGYLLKQALKRELNRGVI 
KQVKGKGASGSFWVQKSRKTPQKSRNRKNRSSAVDPEPQVKIiE 
DVLPLAFTRLCEPKEASYSLIRKYVSQYYPKLRVDIRPQLLKNA 
LQRAVERGQLEQITGKGASGTFQLKKSGEKPLLGGSLMEYAILS 
A I AAMN E PKTCSTTALKKYVLENH PGTNSN YQMHLL KKTLQKCB 
KNGWMEQISGKGFSGTFQLCFPYYPSPGVLFPKKEPDDSRDEDE 
DEDESSEEDSEDEEPPPKRRLQKKTPAKSPGKAASVKQRGSKPA 
PKVSAAQRGKARPLPKKAPPKAKTPAKKTRPSSTVIKKPSGGSS 
KKPATSARKE 

NKWLRPSPRSHRTPESGRVLSLFRLPPPGMALSGSTPAPCWEED
ECLDYYGMLSLHRMFEWGGQLTECELELLAFLLDEAPGAAGGL
SRARSGLKLLLELERRGQCDESNLRLLGQLLRVLARHDLLPHLA 
RKRRRPVSPERYSYGTSSSSKRTEGSCRRRRQSSSSANSQQGSP 
PTKRQRRSRGRPSGGARRRRRGPQPHPSSSQSPPDLPLKAK 


6523 


2 


1097 


ASCQTRRRTAALDSGERIAGRRSPIALAMASNFNDIVKQGYVKI 
RSRKLGIFRRCWLVFKKASSKGPRRLEKFPDEKAAYFRNFHKVT
ELHNIKNITRLPRETKKHAVAIIFHDETSKTFACESELEAEEWC
KHLCMECLGTRLNDISLGEPDLLAAGVQREQNERFNVYLMPTPN
LDIYGECTMQITHENIYLWDIHNAKVKLVMWPLSSLRRYGRDST
WFTFESGRMCDTGEGLFTFQTREGEMIYQKVHSATLAIAEQHER 
LMLEMEQKARLQTSLTEPMTLSKSISLPRSAYWHHITRQNSVGE 
IYSLQGNHENRHSDLTGKSCKTSENRFLEENAPLVMYGITHHLF 
MDTSTCKWHDLE 


6524


2 


1097 


ASCQTRRRTAALDSGERIAGRRSPIALAMASNFNDIVKQGYVKI 
RSRKLGIFRRCWLVFKKASSKGPRRLEKFPDEKAAYFRNFHKVT 
ELHNIKNITRLPRETKKHAVAIIFHDETSKTFACESELEAEEWC
KHLCMECLGTRLNDISLGEPDLLAAGVQREQNERFNVYLMPTPN 
LDIYGECTMQITHENIYLWDIHNAKVKLVMWPLSSLRRYGRDST 
WFTFESGRMCDTGEGLFTFQTREGEMIYQKVHSATLAIAEQHER 
LMLEMEQKARLQTSLTEPMTLSKSISLPRSAYWHHITRQNSVGE 
IYSLQGNHENRHSDLTGKSCKTSENRFLEENAPLVMYGITHHLF
MDTSTCKWHDLE 


6525 


1 


1859 


GESPFSEEESIEFNPSSSGRSARTVSSNSFCSDDTGWPSSQSVS 
PVKTPSDAGNSPIGFCPGSDEGFTRKKCTIGMVGEGSIQSSRYK 
KESKSGLVKPGSEADFSSSSSTGSISAPEVHMSTAGSKRSSSSR 
NRGPHGRSNGASSHKPGQQP^QPPPKTyr.T.QMT msKrriT co^na-ruia 

SYAPSSPSSSNSGSYKGSDCSPIMRRSGRYMSCGENHGVRPPNP 
EQYLTPLQQKEVTVRHLKTKLKESERRLHERESEIVELKSQLAR 
MREDW I EEE CHRVEAQLALKEARKE I KQLKQVI ETMRS S LADKD 
KG I Q K YFVD INI QN KKLE S LLQS MEMAHSGS LRDE LCLDF P CDS 
PE KS LTLNP PLDTM7VDGLS LEEQVTGEGADRELLVGDS I ANSTD 
LFDEIVTATTTESGDLELVHSTPGANVLELLPIVMGQEEGSWV 
ERAVQTDWPYSPAISELIQSVLQKLQDPCPSSLASPDESEPDS 
MES FPESLSALWDLTPRNPNSAI LLS PVETPYANVDAEVHANR 
LMRELDFAACVEERLDGVIPLARGGWRQYWSSSFLVDLLAVAA 
PWPTVLWAFSTQRGGTDPVYNIGALLRGCCWALHSLRRTAFR 
IKT 


6526 


2 


2034 


SGRAGEPEEWRGRQIIDSKETWIPFNSEDSQQLEEAYSSGKGCN 
GRWPTDGGRYDVHLGERMRYAVYWDELASEVRRCTWFYKGDKD 
NKYVP YSES FS Q VLEETYMLAVTLDE WKKKLES PNREI 1 1 LHNP 
KLMVHYQPVAGSDDWGSTPMEQGRPRTVKRGVENISVDIHCGEP 
LQIDHLVFWHGIGPACDLRFRSIVQCVNDFRSVSLNLLQTHFK 
KAQENQQIGRVEFLPVNWHSPLHSTGVDVDLQRITLPSINRLRH 
FTNDTI LDVFFYNS PTYCQTI VDTVAS EMNR I YTLFLQRNPDFK 
GGVSIAGHSLGSLILFDILTNQKDSLGDIDSEKGSLNIVMDQGD 
TPTLEEDLKKLQLS EFFDI FEKEKVDKEALALCTDRDLQE IG I P 
LGPRKKILNYFSTRKNSMGIKRPAPQPASGANIPKESEFCSSSN 
TRNGDYLDVGIGQVSVKYPRLIYKPEIFFAFGSPIGMFLTVRGL 
KRIDPNYRFPTCKGFFNIYHPFDPVAYRIEPMVVPGVEFEPMLI 
PHHKGRKRMHLELREGLTRMSMDLKNNLLGSLRMAWKSFTRAPY 
PALQASETPEBTEAEPESTSEKPSDVNTEETSVAVKEEVLPINV 
GMLNGGQRIDYVLQEKPIESFNEYLFALQSHLCYWESEDTVLLV 
LKEIYQTQGIFLDQPLQ 


6527 


1 


922 


GWVPLLSRILPSDACKIYKQGINIRLDTTllDFTDMKCfoRGDLS 
F I FNG D AAPSE S FWLDNE Q KVYQR I HHE E S EM ETEE E VD I LMS 
SDIYSATLSTKSISFTRAQTGWLFREDKTERVGNFLADFYLVNG
LVLESRKRREHLSEEDILRNKAIMESLSKGGNIMEQNFEPIRRQ 
S LT P P P QNT I TW E E Y I S AENG KAPH LGRELVCKE S KKT F KATI A 
MSQEFPLGIELLLNVLEWAPFKHFNKLREFVQMKLPPGFPVKL 
D I P V F P T I TATVTFQE FR YDE FDG S I FTI PDD YKE DPSR F PDL 


6528 


1 


1073 


LTGPAAAEPRCAADAGMKRALGRRKGVWLRLRKILFCVLGLYIA 
IPFLIKLCPGIQAKLIFLNFVRVPYFIDLKKPQDQGLNHTCNYY 
LQPEEDVTIGVWHTVPAVWWKNAQGKDQMWYEDALASSHPIILY
LHGNAGTRGGDHRVELYKVLSSLGYHWTFDYRGWGDSVGTPSE
RGMTYDALHVFDWIKARSGDNPVYIWGHSLGTGVATNLVRRLCE 
RETPPDALILESPFTNIREEAKSHPFSVIYRYFPGFDWFFLDPI 
TSSGIKFANDENVKHISCPLLILHAEDDPWPFQLGRKLYS IAA 
PARSFRDFKVQFVPFHSDLGYRHKYIYKSPELPRILREFLGKSE 
PEHQH 


6529




2215 


THIRYNKIGWKTMSCGNEFVETLKKIGYPKADNLNGEDFDWLF 
EGVEDESFLKWFCGNVNEQNVLSERELEAFSILQKSGKPILEGA 
ALDEALKTCKTSDLKTPRLDDKELEKLEDEVQTLLKLKNLKIQR 
RNKCQLMASVTSHKSLRLNAKEEEATKKLKQSQGILNAMITKIS 
NELQ ALTDEVTQ LMMFFRHSNLGQGTN P LVFLSQ F S LEKYL SQE 
EQS TAALTL YTKKQ FFQG I HE WE S S N E S QFFN FL KI QTP S I CD 
NQEILEERRLEMARLQLAYICAQHQLIHLKASNSSMKSSIKWAE 
ESLHSLTSKAVDKENLDAKI SSLTSE IMKLEKEVTQ t KDRSLPA 
WR ENAQ LLNM P WKGDFD LQ I AKQD YYTARQE L VLNQL I KQ KA 
SFELLQLSYEIELRKHRDIYRQLENLVQELSQSNMMLYKQLEML 
TDP S VS QQ INPRNT I DTKD YS THRLYQ VL3GENKKKELFLTHGN 
LEEVAE KLKQNI SLVQDQLAVS AQEHS FFLSKRNKD VDMLCDTL 
YQGGNQLLLSDQELTEQFHKVESQLNKLNHLLTDILADVKTKRK 
TLANNKLHQMEREFYVYFLKDEDYLKDIVENLETQSKIKAVSLE 
D 


6530 


128 


2986 


G AAHHG A I VQ VH P LL PGS S T I M I HDLCLVFPA P AKAWYVSD I Q 
ELY I R WDKVE I GKTVKAYVR VLDLHKKPFLAKYFPFMDLKLRA 
AS P 1 1 TLVALDEALDNYTI TFL I RGVAIGQTSLTA3 VTNKAGQR 
INSAPQQIEVFPPFRLMPRKVTLLIGATMQVTSEGGPQPQSNIL 
FS I SNESVALVSAAGLVQGLAIGNGTVSGLVQAVDAETGKWI I 
S QDLVQ VE VLLLRAVRI RAP I MRMRTGTQM P I YVTG I TNHQN P F 
S FGNAVPGLTFHWS VTKRDVLDLRGRHHEAS IRLPSQ YNFAMNV 
LGRVKGRTGLRAWKAVDPTSGQLYGLARELSDEIQVQVFEKLQ 
LLNPE I EAEQ I LMS PNS Y I KLQTNRDGAAS LS YR VLDG PE KVP V 
VHVDEKGFLASGSMIGTSTI EVI AQEPFGANQTI I VAVKVS PVS 
YLRVSMS P VLHTQNKEALVAVPLGMTVTFTVHFHDNS GDVFHAH 
S 3 VLNFATNRDD FVQ I G KG PTNNTCWRT VS VGLTLLR VWD AKH 
PGLSDFMPLPVLQAISPELSGAMWGDVLCLATVLTSLEGLSGT 
WS SSANS I LHIDP KTGVAVARAVGS VTVYYE VAGHLRTYKEVW 
S VPQR I MARHLHP IQTS FQEATAS KVI VAVGDRS SNLRGECTPT 
QREVIQALHPETLISCQSQFKPAVFDFPSQDVFTVEPQFDTALG 
QYFCSITMHRLTDKQRKHLSMKKTALWSASLSSSHFSTEQVGA 
EVPFSPGLFADQAEILLSNHYTSSEIRVFGAPEVLENLEVKSGS 
PAVLAFAKE KSFGWPS F ITYTVGVLDPAAGS QGPLSTTLTFS S P 
VTNQAIAIPVTVAFWDRRGPGPYGASLFQHFLDSYQVMFFTLF 
ALLAGTAVMIIAYHTVCTPRDLAVPAALTPRASPGHSPHYFAAS 
S PTS PNALP PARKAS P PSGLWS PAYASH 


6531 


845 


1425 


PSASIPPSASPDPVPDIRTCHFCLVEDPSVGCISGSEKCTISSS 
S LCMV I T I Y YDVKVR FJ VRG CGQ Y I S YR CQE KRNT Y FAE Y WYQA 
QCCQYDYCNSWSSPQLQSSLPEPHDRPLALPLSDSQIQWFYQAL 
NLSLPLPNFHAGTEPDGLDPMVTLSLNLGLSFAELRRMYLFLNS 
SGLLVLPQAGLLTPHPS 


6532 


2 


954 


AAGPPSEWNQDSLFPEPEPGPAPQVLLGPQGPGLIKGVAPPTL
ITDSTGTHLVLTVTNKNAHSPGLSRGSPQQPSSQPGSPAPAPSA 
QMDLEHPLQPLFGTPTSLLKKEPPGYEEAMSQQPKQQENGSSSQ 
QMDDLFDILIQSGEISADFKEPPSLPGKEKPSPKTVCWSPLAAQ 
PSPSAELPQAAPPPPGSPSLPGRLEDFLESSTGLPLLTSGHDGP 
EPLSLIDDLHSQMLSSTAILDHPPSPMDTSELHFVPEPSSTMGL 
DLADGHLDSMDWLELSSGGPVLSLAPLSTTAPSLFSTDFLDGHD 
LQLHWDSCL 


6533 


1798


373 


S TI S WLARVE P P RRSSGVG AARLRF PGGS RPLRARAC VLALAVL 
ALLERNNADSMS AHSMIiCER I AIAKEL I KRAES LS RS RKGG I EG 
GAKLCSKLKAELKFLQKVEAGKVAIKESHLQSTNLTHLRAIVES 
AENL E E WS VLH V FG YTDTLG E KQTLWD WANGGHT WVKA I G R 
KAEALHNI WLGRGO YGDKS 1 1 EOAKDFLOAQHnnPvn vcmdu t t 
FAFYNSVS S PMAEKLKEMG I S VRGD I VAVNALLDHPEELQ PS E S 
ESDDEG P ELLQVTRVDREN I LASVAF PTE I KVDVCKR VNLDI TT 
LITYVSALSYGGCHFIFKEKVLTEQAEQERKEQVLPQLEAFMKD 
KELFACESAVKDFQSILDTLGGPGERERATVLI KR INWPDQPS 
ERALRL VAS SKINSRSLTI FGTGDTL KAI TMTANS G F VRAANNQ 
GVKFSVFIHQPRALTESKEAliATPLPKDYTTDSEH 


6534 


47 


596 


KATRFISAAFWLNKQGVSPAKLPHTSWSWSLQTLSFLFSGDLA 
EKSLQCFPCSAMLLELIPLLGIHFVLRTARAQSVTQPDIHITVS 
EGASLELRCNYSYGATPYLFWMERTVEEAFILLVCLKPWRVASS 
LEKKEKEDESFQLLLGSRYNVLKAHCLLPLIRWLTSGDSLLSAQ 
PHCPQGL 


6535 


250 


964 


LIKTFFRDVAIQRDLLPKEKNLETLLTLAFLEIDKAFSSHARLS 
ADATLLTSGTTATVALLRDGIELWASVGDSRAILCRKGKPMKL 
TIDHTPERKDEKERI FCKCGGFVAWNST.CJOPHVMfiPT.&MTP c mn 

LDLKTSGVIAEPETKRIKLHHADDSFLVLTTDGINFMVNSQEIW 
DFVNQCHDPNEAAHAVTEQAIQYGTEDNSTAVWPFGAWGKYKN 
SEINFSFSRSFASSGRWA 


6536 


242 


1174 


SLVKEMTNQYGILFKQEQAHDDAIWSVAWGTNKKENSETWTGS 
LDDLVKVWKWRDERLDLQWSLEGHQLGWSVDISHTLPIAASSS 
LDAH I RL W D LENG KO I KS I DAG P VD AWTLAF 3 pn Q n Y T .IVTTVruv 

GKVNIFGVESGKKEYSLDTRGKFILSIAYSPDGKYLASGAIDGI 
INIFDIATGKLLHTLEGHAMPIRSLTFSPDSQLLVTASDDGYIK 
IYDVQHANLAGTLSGHASWVLNVAFCPDDTHFVS3SSDKSVKVW 
DVGTRTCV1ITFFDHQDQVWGVKYNGNGSKIVSVGDDQEIHIYDC 
PI 


6537 


1638 


921 


NRFNPPPTQGPDPSLVYRPDVDPEVAKDKASFRNYTSGPLLDRV 
FTTYKLMHTHQTVDFVRSKHAQFGGFSYKKMTVMEAVDLLDGLV 
DESDPDVDFPNSFHAFOTAEGTRKAHPniCDWFHr.VRT.T.HnT rvxr 
LALFGEPQWAWGDTFPVGCRPOASWFCDSTFODKTPDljOnPPY 
STELGMYQPHCGLDRVLMSWGHDGEARGGQWGGGGRWGTVGGGG 
AEAVPAGDTLSPQSTCTR 


6538 


3345 


2412 


PYLYDFLDALITCQTAPEEAFIKLDGLAGMLTEQLRRLTKQVQE
ARHNRDDEAIKXAVNEYDETMEKYIPVLMAQAXIYWNLEIOTPMV
EKIFRKSVEFCNDHDWKLNVAHVLFMQENKYKEAIGFYEPIVK
KHYDNILNVSAIVLANLCVSYIMTSQNEKAEELMRKIEKEEEQL
SYDDPNRKMYHLCIVNLVIGTLYCAKGNYEFGISRVIKSLEPYN
KKLGTDTWYYAKRCFLSLLENMSKHMIVIHDSVIQECVQFLGHC
ELYGTNIPAVIEQPLEEERMHVGICNTVTDESRQLKALIYEIIGW
NK


6539 


216 


339 


FLGAASPHPHFSSLAPHPDQPEFTPVQDELEAMELWGPGV 


6540 


3 


391 


LERLWLLLLRRPEDAMAECPTLGEAVTDHPDRLWAWEKFVYLDE 
KQHAWLPLT I E I KDRLQLR VLLRRED WLGRPMTPTQ I G P S LLP 
IMWQLYPDGRYRSSDSSFWRLVYHIKIDGVEDMLLELLPDD


6541 


1165 


536 


RTLVQRRILMLLRKPARGRDLRGRGRGTPRGGRKGLLPTPDEFP 
RFEGGRKPDSWDGNREPGPGHEHFRDTPRPDHPPHDGHSPASRE 
RSSSLQGMDMASLPPRKRPWHDGPGTSEHREMEAPGGPSEDRGG 
KGRGGPGPAQRVPKSGRSSSLDGEHHDGYHRDEPFGGPPGSGTP 
SRGGRSGSNWGRGSNMNSGPPRRGASRGGGRGR 


6542 


3 


3775 


SWPRGRGETGGHPGALRTRTMQKSVRYNEGHALYLAFLARKEGT 
KRGFLSKKTAEASRWHEKWFALYQNVLFYFEGEQSCRPAGMYLL 
EGCSCERTPAPPRAGAGQGGVRDALDKQYYFTVLFGHEGQKPLE 
LRCEEEQDGKEWMEAIHQASYADILIEREVLMQKYIHLVQIVET 
EKIAANQLRHQLEDQDTEIERLKSEIIALNKTKERMRPYQSNQE 
DEDPDIKKIKKVQSFMRGWLCRRKWKTIVQDYICSPHAESMRKR 
NQ I VFTMVE AE S E YVHQL Y I LVNGF LR P LRMAAS S KKP P I SHDD 
VSSIFLNSETIMFLHEIFHQGLKARIANWPTLILADLFDILLPM 
LNIYQEFVRNHQYSLQVIiANCKQNRDFDKLLKQYEANPACEGRM 
LET FLTYPM FQ I PRY I I TLHELLAH TPHE HVE R KSLE FAKSKLE 
ELS R VMHD E VS DTEN I RKNIjA I E RM I V EG CD I LLDTS QTF I RQG 
SLIQVPSVERGKLSKVRLGSLSLKKEGERQCFLFTKHFLICTRS 
SGGKLHLLKTGG VLSL I DCTL I EE PDASDDDS KGSGQ VFGHLD F 
K I WE P PDRAAFTWLLAPSRQE KAAWMS D I S QCVDN I RCNGLM 
T I VFEENS KVTVPHM I KS DARLH KDDTDI CFS KTLNS CKVPQ I R 
YAS VERLLERLTDLRFLS IDFLNTFLHTYRI FTTAAWLGKLSD 
IYKRPFTS I P VRSLELFFATSQNNRGEKLVDG KS PRLCRKFSS P 

PAAS P PPHTGQ I PLDLSRGLS S PEQS PGTVEENVDNPRVDLCNK 
LKRS I QKAVLESAPADRAGVESSPAADTTELS PCRSPSTPRHLR 
YRQPGGQTADNAHCS VS PASAFAI ATAAAGHGS P PGFNNTERTC 
DKE F 1 1 RRTATNR VLNVLRH WVS KHAQDFELNNE LKMNVLNLLE 
EVLRDPDLLPQERKAAANILMALSQDDQDDIHLKLEDIIQMTDC 
MKAECFESLSAMELAEQITLLDHVIFRSIPYEEFLGQGWMKLDK 
NERT P Y I M KT S OH FNDM 53 NLVAS n T MW Y ATW <; P ft ma t tt irun/a \i 

AD I CRCLHNYNGVLE I TSALNRSAI YRLKKTWAKVSKQTKALMD 
KLQKTVSSEGRFKNLRETLKNCNPPAVPYLGMYLTDLAFIEEGT 
PNFTEEGLVNFSKMRMISHIIREIRQFQQTSYRIDHQPKVAQYL 
LDKDLI IDEDTLYELSLKIEPRLPA 


6543 


1857 


950 


FVSGCGRAG I GLS WAMAAEARVSRW YFGGLAS CGAACCTHPLDL 
LKVHLQTQQE V KLRMTGMALRWRTDG I LALYS GL S AS LCRQMT 
YS LTR FAI YETVRDRVAKG S QG PL P FHEKVLLG S VS GLAGG FVG 
TPADL VNVRMQND VKLPQGQRRNYAHALDGL YR VAREEGLRRL F 
SGATMASSRGALVTVGQLSCYDQAKQLVLSTGYLSDNIFTHFVA 
SFIAGGCATFLCQPLDVLKTRLMNSKGEYQGVFHCAVETAKLGP 
LAFYKGLVPAG I RL I PHTVLTFVFLEQLRKNFG I KVPS 


6544 


630 


79 


PSPCFIRSRLDGQPWMAGLEAWLSQNFSLHQPQSRVRVRRASIS 
EPSDTDPEPRTLNPSPAGWFVQQHPELELMSSFRERFGRNWLQY 
RS HLE P S GN PL PAT P TTS AP SAP PAS SQG PDTAPRP S P PQ BEAR 
GPQESPQKMSEEVRAEPQEEEEEKEGKEEKEEGEMAPLPEAHLG
EGKQKECP 


6545 


176 


560 


PPHSHAALLPAAMTPLLTLILVVLMGLPLAQALDCHVCAYNGDN 
CFNPMRCPAMVAYCMTTRTYYTPTRMKVSKSCVPRCFETVYDGY 
S KHAS TTS CCQ YDLCNGTGLATPATLALAP I LLATLWGLL 


6546 

... 


1657 


364 


HLLNGLDEVAAFFVADLGAIVRKHFCFLKCLPRVRPFYAVKCNS 
SPGVLKVLAQLGLGFSCANKAEMELVQHIGIPASKIICANPCKQ 
IAQI KYAAKHG I QLLS FDNEMELAKWKSH PSAKMVLC I ATDDS 
HS LS CLS LK FGVS L KS CRHLLENAKKHHVE WG VS FH I GSGC PD 
PQAYAQS I AD AR L VFEMGTE LGHKMHVLDLGGG FPGTEGAKVR F 
EEIASVINSALDLYFPEGCGVDIFAELGRYYVTSAFTVAVSIIA 
KKEVLLDQPGREEENGSTSKTiVYHLDEGVYGIFNSVLFDNICP
TPILQKKPSTEQPLYSSSLWGPAVDGCDCVAEGLWLPQLHVGDW 
LVFDNMGAYTVGMGSPFWGTQACHITYAMSRVAWEALRRQLMAA 
EQEDDVEGVCKPLSCGWEITDTLCVGPVFTPASIM 


6547 


1 


541 


LHSKYIAPALCSQPGMMRCCRRRCCCRQPPHALRPLLLLPLVLL 
PPLAAAAAGPNRCDTIYQGFAECLIRLGDSMGRGGELETICRSW 
NDFHACASQVLSGCPEEAAAVWESLQQEARQAPRPNNLHTLCGA 
PVHVRERGTGSETNQETLRATAPALPMAPAPPLLAAAliALAYLL 
RPLA 


6548 


2 


219 


FVSRLSVRDVRFPTFLGGHGADAMHTDPDYSAAYVPIETDAEDG 
I KG CG I TFTLGKG TEVG ELKILSR FONA 


6549 


73 


1490 


ETGR VC ED AR P ACG S R S RRRRKE AAPG IPTPSPSSSSPTSSRPA 
ARAFSKAPARLSRPRAREEPPDPGRRYIQEEIIQARKHKLIKMC 
SSVAAKLWFLTDRRIREDYPQKEILRALKAKCCEEELDFRAVVM 
DEWLTIEQGNLGLRINGELITAYPQWWRVPTPWVQSDSDIT 
VLRHLE KMGCR LMNR PQ A I LNCVNKFWT FQE LAGHG VP L P DTF S 
YGGHENFAKMIDEAEVLEFPMWKNTRGHRGKAVFLARDKHHLA 
DLSHLIRHEAPYLFQKYVKESHGRDVRVIWGGRWGTMLRCST 
DGRMQSNCSLGGVGMMCSLSEQGKQLAIQVSNILGMDVCGIDLL 
MKDDGS FCVCEANANVG FI AFDKACNLDVAG I IADYAASLLPSG 
RLTRRMS LLS WSTAS ETS E PELGP PASTAVDNMSASSS S VDSD 


6550 


2293 


922 


FRVSRDGAPDCGIEQMGLAMEHGGSYARAGGSSRGCWYYLRYFF 
LFVSLIQFLII LGLVLFMVYGNVHVS TESNLQATERRAEGL YSQ 
LLGLTASQSNLTKELNFTTRAKDAIMQMWLNARRDLDR3NASFR 
QCQGDRVI YTNNQR YMAA 1 1 LS E KQCR DQ F KDMNKS CDALL FML 
NQKVKTLEVEIAKEKTICTKDKESVLLNKRVAEFOT.VFrvVTPF 
LQHQERQLAKEQLQKVQALCLPLDKDKFEMDLRNLWRDS 1 1 PRS 
LDNLGYNLYHPLGSELASXRRACDHNPSLMSSKVEELARSLRAD 
IERVARENSDLQRQKLBAQQGLRASQEAKQKVEKEAQAREAKLQ 
AECSRQTQLALEEKAVLRKERDNLAKELEEKKREAEQLRMELAI 
RN S ALDTC I KT KS Q PMM P VS RPMG P VPNPQ P I D PAS LEE F KR KI 
LESQRPPAGI PVAPSSG 


6551 


157 


748 


IQPPDPRNMTLAAYKEKMKELPLVSLFCSCFLADPLNKSSYKYE 
ADTVDLNWCVISDMEVIELNKCTSGQSFEVILKPPSFDGVPEFN
ASLPRRRDPSLEEIQKKLEAAEERRKYQEAELLKHLAEKREHER
EVIQKAIEENNNFIKMAKEKLAQKMESNKENREAHLAAMLERLQ
EKDKHAEEVRKNKELKEEASR


6552 


157 


748 


IQPPDPRNMTLAAYKEKMKELPLVSLFCSCFLADPLNKSSYKYE 
ADTVDLNWCVISDMEVIELNKCTSGQSFEVILKPPSFDGVPEFN
ASLPRRRDPSLEEIQKKLEAAEERRKYQEAELLKHLAEKREHER
EVIQKAIEENNNFIKMAKEKLAQKMESNKENREAHLAAMLERLQ
EKDKHAEEVRKNKELKEEASR 


6553 



2 


1807 


FV^SKMAAHLSYGRVNLNVLREAVRRELREFLDKCAGSKAIVWD 
EYLTGPFGLIAQYSLLKEHEVEKMFTLKGNRLPAADVKNI I FFV 
RPRLELMDIIAENVLSEDRRGPTRDFHILFVPRRSLLCEQRLKD 
LGVLGSFIHREEYSLDLIPFDGDLLSMESEGAFKECYLEGDQTS 
LYHAAKGLMTLQALYGTIPQIFGKGECARQVANMMIRMKREFTG 
SQNS I FP VFDNLLLLDRNVDLLTPLATQLT YEGLI DE I YGIQNS 
YVKLPPEKFAPKKQGDGGKDLPTEAKKLQLNSAEELYAEIRDKN 
FNAVGSVLSKKAKI ISAAFEERHNAKTVGE I KQFVSQLPHMQAA 
RGSIANHTSIAELIKDVTTSEDFFDKLTVEQEFMSGIDTDKVNN 
YIEDCIAQKHSLIKVLRLVCLQSVCNSGLKQKVLDYYKREILQT 
YGYEH I LTLHNLE KAGLLKPQTGGRNNY P T I RKTLRLWMDD VNE 
QNPTDISYVYSGYAPLSVRLAQLLSRPGWRSIEEVLRILPGPHF 
EERQPLPTGLQKKRQPGENRVTLIFFLGGVTFAEIAALRFLSQL 
EDGGTEYVIATTKLMNGTSWIEALMEKPF


6554 


119 


1244 


FEMGSQ VS VES G ALHW I VGGG FGG I AAAS Q LQALNVP FML VDM 
KDSFHHNVAALRASVETGFAKKTFISYSVTFKDNFRQGLWGID 
LKNQMVLLQGGEALPFSHLiI LATGSTGP FPGKFNEVS S QQAAI Q 
AY E DM VRQ VQRS RF I VWGGG S AG VE MAAE I KTE YPE KE VTL I H 
S Q VALADKE LL P S VRQE VKE I LLR KGVQ LLL S E RVSNLE E LPLN 
EYREYIKVQTDKGTEVATNLVILCTGIKINSSAYRKAFESRLAS 
SGALRVNEHLQVEGHSNVYAIGDCADVRTPKMAYLAGLHANIAV 
AN I VNS VKQR PLQ AYKPGALTFLLS MG RNDG VGQ I SGF YVG RLM 
VRLTKSRDLFVSTSWKTMRQSPP 


6555 


1552 


498 


IHMALLRKINQVLLFLLIVTLCVIIiYKKVHKGTVPKNDADDESE 
TPEELEEEIPWICAAAGRMGATMAAINSIYSNTDANILFYWG 
LRNTLTRIRKWIEHSKLREINFKIVEFNPMGLKGKIRPDSSRPE 
LLQPLNFVRFYLPLLIHQHEKVIYLDDDVIVQGDIQELYDTTLA 
LGHAAAFSDDCDLPSAQDINRLVGLQNTYMGYLDYRKKAIKDLG 
I S P S T CS FN PG V I VANMTE WKHQR I TKQLE KWMQ KNVEENL YS S 
SLGGGVATSPMLIVFHGKYSTINPLWHIRHLGWNPDARYSEHFL 
QEAKLLHWNGRHKPWDFPSVHNDLWESWFVPDPAGIFKLNHHS 


6556 


241 


1449 


ASLCKGCFFVTHVLVIILPSLQSPPTFGFLLDIDGVLVRGHRVI 
PAALKAFRRLVNSQGQLRVPWFVTNAGNILQHSKAQELSALLG 
CEVDADQVILSHSPMKLFSEYHEKRMLVSGQGPVMENAQGLGFR 
NWTVDELRMAFPLLDMVDLERRLKTTPLPRNDFPRIEGVLLLG 
EPVRWETSLQLIMDVLLSNGSPGAGLATPPYPHLPVLASNMDLL 
WMAEAKMPRFGHGTFLLCLETIYQKVTGKELRYEGLMGKPSILT 
YQYAEDLIRRQAERRGWAAP I RKLYAVGDNPMSDVYGANLFHQY 
LQKATHDGAPELGAGGTRQQQPSASQSCISILVCTGVYNPRNPQ 
STEPVLGGGEPPFHGHRDLCFSPGLMEASHWNDVNEAVQLVFR 
KEGWALE


6557

2598 


1534 


RMCGRTSCHLPRDVLTRACAYQDRRGQQRLPEWRDPDKYCPSYN
KSPQSNSPVLLSRLHFEKDADSSERIIAPMRWGLVPSWFKESDP
SKLQFNTTNCRSDTVMEKRSFKVPLGKGRRCWIiADGFYEWQRC
QGTNQRQPYFIYFPQIKTEKSGSIGAADSPENWEKVWDNWRLLT
MAGIFDCWEPPEGGDVLYSYTIITVDSCKGLSDIHHRMPAILDG
EEAVSKWLDFGEVSTQEALKLIHPTENITFHAVSSWNNSRNNT
PECLAPVDLWKKELRASGSSQRMLQWIiATKSPKKEDSKTPQKE
ESDVPQWSSQFLQKSPLPTKRGTAGLLEQWLKREKEEEPVAKRP
YSQ


6558 


21 


1138 


FHGRRRGGRKM E LGS CLEGGREAAEE EG E PE VKKRRLLC VE FAS 
VASCDAAVAQCFLAENDWEMERAIiNSYFEPPVEESALERRPETI 
SEPKTYVDLTNEETTDSTTSKISPSEDTQQENGSMFSLITWNID 
GLDLNNLSERARGVCSYLALYSPDVIFLQEVIPPYYSYLKKRSS 
NYEIITGHEEGYFTAIMLKKSRVKLKSQEtlPFPSTKMMRNLLC 
VH\mVSGNELCLMTSHLESTRGHAAERMNQIiKMVLKKMQEAPES 
ATV I FAG DTNLRDRE VTR CGG L PNN I VDVWE FLGKP KHCQ YTWD 
TQMNSNLGITAACKLRFDRIFFRAAAEEGHIIPRSLDLLGLEKL 
DCGRFP S DHWGLLCNLD I I L 


6559 


3 


364 


GPELSGLPTRPKKLKANQTPIAMDCCASRSCSVPTGPATTICSS 
DKSCRCGVCLPSTCPHTVWLLEPTCCDNCPPPCHIPQPCVPTCF 
LLNSCQPTPGLETLNLTTFTQPCCEPCIjPRGC 


6560


3 


1435 


TATSGGIWLRRKWRCHWPRPLPQSCVGTEGGLQVRDTSSRIAKG 
GVDHTKMSLHGASGGHERSRDRRRSSDRSRDSSHERTESQLTPC 
I RNVTS PTRQHHVE RE KDHS SSRPSSPRPQ K AS PNGS I S S AGNS 
SRNSSQSSSDGSCKTAGEMVFVYENAKEGARNIRTSERVTLIVD 
NTRFWDPS I FTAQPNTMLGRMFGSGREHNFTRPNEKGEYEVAE 
GIGSTVFRAILDYYKTGIIRCPDGISIPELREACDYLCISFEYS 
TIKCRDLSALMHELSNDGARRQFEFYLEEMILPLMVASAQSGER 
ECHIWLTDDDWDWDEEYPPQMGEEYSQ1IYSTKLYRFFKYIE
NRDVAKSVLKERGLKKIRLGIEGYPTYKEKVKKRPGGRPEVIYN 
YVQRPFIRMSWEKEEGKSRHVDFQCVKSKSITNLAAAAADIPQD 
QLWMHPTPQVDELDILPIHPPSGNSDLDPDAQNPML 


6561 


3 


1086 


PGRRFRRKESSSSRWFPADCLLGLRGPASSLLSPEPSPSWPSHS 
PCPMAALTDLSFMYRWFKNCNLVGNLSEKYVFITGCDSGFGNLL 
AKQLVDRGMQVLAACFTEEGSQKLQRDTSYRLQTTLLDVTKSES 
I KAAAQ W VRD KVG E QG L W ALVNNAGVGL P S GPNE WLTKDD FVKV 
I NVNLVGLI E VTLHM L PMVKRARGRWNMS S SGGR VAVIGGG YC 
VSKFGVEAFSDSIRRELYYFGVKVCIIEPGNYRTAILGKENLES 
RMRKLWERLPQETRDSYGEDYFRIYTDKLKNIMQVAEPRVRDVI 
NSMEHAIVSRSPRIRYNPGLDAKLLYIPLAKLPTPVTDFILSRY 
LPRPADSV 


6562 


1


1562 


MSTLYDIRAHKAQLLRFFASSDSNKALEQRRTLHTPKLEHLDRV 
LYEWFLGKRSEGVPVSGPMLIEKAKDFYEQMQLTEPCVFSGGWli 
WRFKARHGIKKLDASSEKQSADHQAAEQFCAFFRSLAAEHGLSA 
EQVYNADETGLFWRCLPNPTPEGGAVPGPKQGKDRLTVLMCANA 
TGSHRLKPLAIGKCSGPRAFKGIQHLPVAYKAQGNAWVDKEIFS 
DWFHHIFVPSVREHFRTIGLPEDSKAVLLLDSSRAHPQEAELVS 
SNVFTIFLPASVASLVQPMEQGIRRDFMRNFINPPVPLQGPHAR 
YNMNDAIFSVACAWNAVPSHVFRRAWRKLWPSVAFAEGSSSEEE 
LEAECFPVKPHNKSFAHILELVKEGSSCPGQLRQRQAASWGVAG 
REAEGGRPPAATSPAEWWSSEKTPKADQDGRGDPGEGEEVAWE 
QAAVAFDAVLR FAERQ P C FS AQ E VGQLRALRAVFRS QQQVRRRR 
GALGAWKVEALQEGPGGCGATAQS PLP CS S TAGDN 


6563 


1319 


2694 


LARPAQP VLLRE PEGAG P P VPAGHLVHHLQGGHLRERAHPDLEA 
HEHPLPCDQMFWRQMGGHLRMVEANSRGWWGIGYDHTAWVYTG 
GYGGGCFQGLASSTSNIYTQSDVKCVHIYENQRWNPVTGYTSRG 
LPTDRYMWSDASGLQECTKAGTKPPSLQWAWVSDWFVDFSVPGG 
TDQEGWQYASDFPASYHGSKTMKDFVRRRCWARKCKLVTSGPWL 
EVP P IALRDVS 1 1 PES PGAEGSGHS I ALWAVS DKGD VLCRLGVS 
ELNPAGSSWLHVGTDQPFASISIGACYQVWAVARDGSAFYRGSV 
Y P SQ P AGD CW YH I PS P PRQ RLKQ VS AGQTS VYALDENGNLW YRQ 
GITPSYPQGSSWEHVSNNVCRVSVGPLDQVWVIANKVQGSHSLS 
RG T VCHRTG VQ PHE PKGHG WD YG I GGGWDH I S VRANAT2APRSS 
SQEQEPSAPPEAHGPVCC 


6564 


1 


975 


APGSCALWSYCGRGWSRAMRGCQLLGLRSSWPGDLLSARLLSQE 
KRAAETHFGFETVSEEEKGGKVYQVFESVAKKYDVMNDMMSLGI 
HRVWKDLLLWKMHPIjPGTQLLDVAGGTGDIAFRFLNYVQSQHQR 
KQKRQLRAQQNLSWEEIAKEYQNEEDSLGGSRVWCDINKEMLK 
VGKQ KALAQGYRAGLAWVLGDAEELP FDDDKFDI YTI AFG I RNV 
THIDQAIiQEAHRVLKPGGRFLCLEFSQVNNPLISRLYDLYSFQV 
IPVLGEVIAGDWKSYQYLVESIRRFPSQEEFKDMIEDAGFHKVT 
YESLTSGIVAIHSGFKL 


6565 


1464 


999 


RSAVANGLTKRRMGLKLNGR Y I SL I LAVQIAYLVQAVRAAGKCD 
AVFKG FSDCLLKLGDSMANYPQGLDDKTNI KTVCTYWEDFHSCT 
VTALTDCQEGAKDMWDKLRKESKNLNIQGSLFELCGSGNGAAGS 
LLPAFPVLLVSLSAALATWLSF 


6566 


3


1385 


KYESAQPGGTQPEPGLGARMAIHKALVMCLGLPLFLFPGAWAQG 
H VP PG CSQG LNP L Y YNLCDRS GAWG I VLEAVAGAG I VTTFVLT I 
I LVAS L P FVQDTKKRS LLGTQVFFLLGTLGLFCLVFACVE KPDF 
STCAS RRFLFGVLFAI CFS CLAAHVFALN FLAR KNHG PRGWV I F 
TVALLLTLVEVI INTEWLI ITLVRGSGEGGPQGNS S AGWAVAS P 
CAIANMDFVMAL I YVMLLLLGAFLGAWPALCGR YKR WRKHG VFV 
LLTTATS VA I WWW I VMYTYGN KQHNS P TWDDPTLA IALAANAW 
AFVLFYVIPEVSQVTKSSPEQSYQGDMYPTRGVGYETILKEQKG 
QSMFVENKAFSMDEPVAAKRPVSPYSGYNGQLLTSVYQPTEMAL
MHKVPS EGAYD 1 1 LPRATANSQVMGSANS TLRAEDM YSAQSHQA 
ATPPKDGKNSQVFRNPYVWD 


6567 


125 


863 


T KRS NLKA YAC S I HH I RTMS YVF VN DS S QTNVP LLQAC I DGDFN 
YSKRLLESGFDPNIRDSRGRTGLHLAAARGNVDICQLLHKFGAD 
LLATD YQGNTALHLCG HVDT I QFL VSNGL K I D I CNHQG AT P LVL 
AKRRGVNKDVIRLLESLEEQEVKGFNRGTHSKLETMQTAESESA 
MESHSLLNPNLQQGEGVLSSFRTTWQEFVEDLGFWRVIiLLIFVI 
ALLSLiGIAYYVSGVLPFVENQPELVH 


6568 


3 


1183 


HAS DRL L VL P DN Y S H FS QAS ANLQG P S RTTE L FH PTLAS I S S PM 
LEGAELYFNVDHGYLEGLVRGCKASLLTQQDYINLVQCETLEDL 
KIHLQTTDYGNFLANHTNPLTVSKIDTEMRKRLCGEFEYFRNHS 
LEPLSTFLTYMTCSYMIDNVILLMNGALQKKSVKEILGKCHPLG 
RFTEMEAVNIAETPSDLFNAILIETPLAPFFQDCMSENALDELN 
IELLRNKLYKSYLEAFYKFCKNHGDVTAEVMCPILEFEADRRAF 
IITLNSFGTELSKEDRETIiYPTFGKLYPEGLRliIiAQAEDFDQMK 
NVADHYGVYKPLFEAVGGSGGKTLEDVFYEREVQMNVLAFNRQF 
HYGVFYAYVKLKEQEIRNIVWIAECISQRHRTKINSYIPIL 


6569 


205 


1532 


RRRGPQRLGHGRPTPLLCRWRTAGPSHWEKQARAFQGLRPVDPR 
RMSWLFPLTKSASSSAAGSPGGLTSLQQQKQRLIESLRNSHSSI 
AEIQKDVEYRLPFTINNLTININILLPPQFPQEKPVISVYPPIR 
HHLMDKQGVYVTSPLVNNFTMHSDLGKIIQSLLDEFWKNPPVLA 
PTSTAFPYLYSNPSGMSPYASQGFPFLPPYPPQEANRSITSLSV 
ADTVSSSTTSHTTAKPAAPSFGVLSNLPLPIPTVDASIPTSQNG 
FGYKMPDVPDAFPELSELSVSQLTDMNEQEEVIjLEQFLTLPQLK 
QI ITDKDDLVKS IEELARKNLLLEPSLEAKRQTVLDKYELLTQM 
KSTFEKKMQRQHELSESCSASALQARLKVAAHEAEEESDNIAED 
FLEGKMEIDDFLSSFMEKRTICHCRRAKEEKLQQAIAMHSQFHA 
PL 


6570 


330 


1304 


ARLPRLTFLREGFLYVLLSHWVFVGAPRPPASDSWKKGLVPSAP 
PASRKMGS KALPAPI PLHPSLQLTNYS FLQAVNTFPATVDHLQG 
LYGLSAVQTMHMNHWTLGYPNVHEITRSTITEMAAAQGLVDARF 
PFPALPFTTHLFHPKQGAIAHVLPALHKDRPRFDFANLAVAATQ 
EDPPKMGDLSKLSPGLGSPISGLSKLTPDRKPSRGRLPSKTKKE 
F I CKFCGRH FT KS YNLL IHERTHTDE R P YTCD I CHKAFRRQDHL 
RDHRYIHSKEKPFKCQECGKGFCQSRTLAVHKTLHMQTSSPTAA 
SSAAKCSGETVICGGT 


6571


169 


656 


A P DMN R KKLQKLTDTLTKNCKHL FRG FDKDNDG C VNVLEW I HGL 
SLFLRGSLEEKMKYCFEVFDLNGDGFISKEEMFHMLKNSLLKQP 
S EEDPDEG I KDLVE I TLKKMDHDHDGKLS FADYELAVREETLLL 
EAFGPCLPDPKSQMEFEAQVFKDPNEFNDM 


6572 


49 


1646 


TPERAQPGALLGAAGCCVCGGRWWPRSHERGYFSSAKMGSKRRN 
LS CSERHQ KLVDENYCKKLH VQAL KNVNS Q I RNQMVQNENDNRV 
QRKQFLRLLQNEQFELDMEEAIQKAEENKRLKELQLKQEEKLAM 
ELiAKLKHESLKDEKMRQQVRENS I ELRELEKKLKAAYMNKERAA 
QIAEKDAI KYEQMKRDAEIAKTMMEEHKRI I KEENAAE D KRN KA 
KAQYYLDLEKQLEEQEKKKQEAYEQLLKEKLMIDEIVRKIYBED 
QLEKQQKLEKMNAMRRYIEEFQKEQALWRKKKREEMEEENRKII 
EFANMQQQREEDRMAKVQENEEKRLQLQNALTQKLEEMLRQRED 
LEQVRQELYQEEQAEIYKSKLKEEAEKKLRKQKEMKQDFEEQMA 
LKELVLQAAKEEEENFRKTMLAKFAEDDRIELMNAQKQRMKQLE 
HRRAVEKLIEERRQQFLADKQRELEEWQLQQRRQGFINAIIEEE 
R LKLLKE HATNLLG YL P KG VF KKEDD I DLLGE E FRKVYQQRS E I 
CEEK 


6573 


767 


275 


GGGGGESQS FRAQDGTRTPATDCLMYLQGPRKLMTQGG YDMVQK ' 

LFLDFFRRRLSQRPTAEELEQRNILKPRNEQEEQEEKREIKRRL 

TRKLSQRPTVEELRERKILIRFSDYVEVADAQDYDRRADKPWTR
LTAAD KVS RGE C WR VGGRTVC WVS LGS PLGS V 


6574 


204 


1159 


LESSVPVSVGVFWACGVSWTGAAGLQDGALSDTMARNAEKAMTA 
LAR FRQAQLEEG KVKER R P FLASE CTELP JCAEKWRRQ 1 1 GE I S K 
KVAQ IQNAGLGEFR I RDLNDE INKLLREKGHWEVR I KELGG PD Y 
GKVGPKMLDHEGKEVPGNRGYKYFGAAKDLPGVRELFEKEPLPP 
PRKT RAE LM KAI D FE Y YG YLD E DD G V I VPLEQE YE KKLRAEL VE 
KWKAEREARLARGEKEEEEEEEEEINIYAVTEEESDEEGSQEKG 
GDDSQQKFIAHVPVPSQQEIEEALVRRKKMELLQKYASETLQAQ 
SEEARRLLGY 


6575 


117 


820 


SPALASQSGGITEEKMLEPQENGVIDLPDYEHVEDETFPPFPPP
ASPERQDGEGTEPDEESGNGAPVPVPPKRTVKRNIPKLDAQRLI
SERGLPALRHVFDKAKFKGKGHEAEDLKMLIRHMEHWAHRLFPK
LQFEDFIDRVEYLGSKKEVQTCLKRIRLDLPILHEDFVSNNDEV
AENNEHDVTSTELDPFLTNLSESEMFASELSISLTEEQQQRIER
NKQLALERRQAKLP


6576 


1 


1060 


PEPQALVGQKRGALRLLVARLVLTVSAPAEVRRRVLRPVLSWMD
RETRALADSHFRGLGVDVPGVGQAPGRVAFVSEPGAFSYADFVR
GFLLPNLPCVFSSAFTQGWGSRRRWVTPAGRPDFDHLLRTYGDV
WPVANCGVQEYNSNPKEHMTLRDYITYWKEYIQAGYSSPRGCL
YLKDWHLCRDFPVEDVFTLPVYFSSDWLNEFWDALDVDDYRFVY 
AGPAGSWSPFHADIFRSFSWSVNVCGRKKWLLFPPGQEEALRDR 
HGNLPYDVTS PALCDTHLHPRNQIAGPPLB ITQEAGEMVFVPSG 
WHHOVHNLVMCCFSCPLSGAFLnEDG^TTSPL^nPTrT rcwwnu&u 
G 


6577 


2271 


987 


SDRMASDDFDIVIEAMLEAPYKKEEDEQQRKEVKKDYPSNTTSS " 

TSNSGNETSGSSTIGBTSNRSRDRDRYRRRNSRSRSPGRQCRHR 

SRSWDRRHGSESRSRDHRREDRVHYRSPPLATGYRYGHSKSPHF 

REKSPVREPVDNLSPEERDARTVFCMQLAARIRPRDLEDFFSAV 

GKVRDVRIISDRNSRRSKGIAYVEFCEIQSVPLAIGLTGQRLLG 

VP 1 1 VQASQAEKNRLAAMANNLQKGNGGPMRL YVGSLHFNI TED 

MLRGIFEPFGKIDNIVLMKDSDTGRSKGYGFITFSDSECARRAL 

EQLNGFELAGRPMRVGHVTERLDGGTDITFPDGDQELDLGSAGG 

RFQLMAKLAEGAGIQLPSTAAAAAAAAAAQAAALQLNGAVPLGA

LNPAALTALSPALNLASQCLQLSSLFTPQTM


6578


377 


1489 


PSSSATMNRAPLKRATILHMALTGASDPSAEAEANGEKPFLLRA 
LQIALWSLYWVTSISMVFLNKYLLDSPSLRLDTPIFVTFYQCL 
VTTLLCKGLSALAACCPGAVDFPSLRLDLRVARSVLPLSWFIG 
MITFNNLCLKYVGVAFYNVGRSLTTVFNVLLSYLLLKQTTSFYA 
LLTCGIIIGGFWLGVDQEGAEGTLSWLGTVFGVLASLCVSLNAI
YTTKVLPAVDGSIWRLTFYNNVNACILFLPLLLLLGELQALRDF 
AQLGSAHFWGMMTLGGLFGFAIGYVTGLQIKFTSPLTHNVSGTA 
KACAQTVLAVLYYEETKSFLWWTSNMMVLGGSSAYTWVRGWEMK 
KTPEEPSPKDSEKSAMGV 


6579 


2 


711 


RPPRVWYPELRELSAAAPRWSHRTAPGIMVFYFTSSSVNSSAYT
IYMGKDKYENEDLIKHGWPEDIWFHVDKLSSAHVYLRLHKGENI
EDIPKEVLMDCAHLVKANSIQGCKMNNVNWYTPWSNLKKTADM
DVGQIGFHRQKDVKIVTVEKKVNEILNRLEKTKVERFPDLAAEK 
ECRDREERNEKKAQIQEMKKREKEEMKKKREMDELRSYSSLMKV
ENMSSNQDGNDSDEFM


6580 


62 


1571 


LVALKNWKPKGTNIPAPQSPVFGEAVSGVYMMTKVLGMAPVLGP 
RPPQEQVGPLMVKVEEKEEKGKYLPSLEMFRQRFRQFGYHDTPG
PREALSQLRVLCCEWLRPEIHTKEQILELLVLEQFLTILPQELQ 
AWVQEHCPESAEEAVTLLEDLERELDEPGHQVSTPPNEQKPVWE 
KISSSGTAKESPSSMQPQPLETSHKYESWGPLYIQESGEEQEFA 
QDPRKVRDCRLSTQHEESADEQKGSETVEGLKGDIISVIIANKPE
ASLERQCVNLENEKGTKPPLQEAGSKKGRESVPTKPTPGERRYI 








CAECGKAFSNSSNLTKHRRTHTGEKPYVCTKCGKAFSHSSNLTL 
HYRTHLVDRPYDCKCGKAFGQSSDLLKHQRMHTEEAPYQCKDCG 
KAFSGKGSLIRHYRIHTGEKPYQCNECGKSFSQHAGLSSHQRLH
TGEKPYKCKECGKAFNHSSNFNKHHRIHTGEKPYWCHHCGKTFC 
SKSNLSKHQRVHTGEGEAP 


6581


228 


476 


RVFLKDLSSTPMASNNTASIAQARKLVEQLKMEANIDRIKVSKA 
AADLMAYCEAHAKEDPLLTPVPASENPFREKKFFCAIL 


6582 


1428 


718 


CFTTKTHCSPVSVPYLSPLVLRKELESLLENEGDQVIHTSSFIN 
QHPIIFWTLVWYFRRLDLPSNLPGLILTSEHCNEGVQLPLSSLS 
QDSKLVYIQLLWDNINLHQEPREPLYVSWRNFNSEKKSSLLSEE 
QQETSTLVETIRQSIQHNNVLKPINLLSQQMKPGMKRQRSLYRE
ILFLSLVSLGRENIDIEAFDNEYGIAYNSLSSEILERLQKIDAP 
PSASVEWCRKCFGAPLI


6583 


487 


41 


RIFSMTSGRLRWRCTWRPATALWSASLRLGTSSMHPSPRSISLP 
LSMMLSPLPSNTRGLSPTALFRSPDSEHATSCPRLHLWRCRAPL

RSPSPLGRLQVLPRSPLHVHTHNSGKEVLGLQVQRSRSGTGPAC 
SQAGSGAVQGGNWCIF 


6584 


189 


1750 


PLPP^AAIX3PSSQ^TEYWRVPKNTTKKYNIMAFNAADKVNFAT 
WNQARLERDLSNKKIYQEEEMPESGAGSEFNRKLREEARRKKYG 
IVLKEFRPEDQPWLLRVNGKSGRKFKGIKKGGVTENTSYYIFTQ 
CPDGAFEAFPVHNWYNFTPLARHRTLTAEEAEEEWERRNKVLNH 
FSIMQQRRLKDQDQDEDEEEKEKRGRRKASELRIHDLEDDLEMS 
SDASDASGEEGGRVPKAKKKAPIAKGGRKKKKKKGSDDEAFEDS 
DDGDB'EGQEVDYMSDGSSSSQEEPESKAKAPQQEEGPKGVDEQS 
DSSEESEEEKPPEEDKEEEEEKKAPTPQEKKRRKDSSEESDSSE 
ESDIDSEASSAFFMAKKKTPPKRERKPSGGSSRGNSRPGTPSAE 
GGSTSSTLRAAASKLEQGKRVSEMPAAKRLRLDTGPQSLSGKST 
PQPPSGKTTPNSGDVQVTEDAVRRYLTRKPMTTKDLLKKFQTKK 
TGLSSEQTVNVLAQILKRLNPERKMINDKMHFSLKE 


6585 


3 


1678 


GPIRNSRIDDFVGGDPRAEASCSVLHSKPHAMADSRDPASDQMQ 
HWKEQRAAQKADVLTTGAGNPVGDKLNVITVGPRGPLLVQDWF
TDEMAHFDRERIPERWHAKGAGAFGYFEVTHDITKYSKAKVFE 
HIGKKTPIAVRFSTVAGESGSADTVRDPRGFAVKFYTEDGNWDL 
VGNNTPIFFIRDPILFPSFIHSQKRNPQTHLKDPDMVWDFWSLR
PESLHQVSFLFSDRGIPDGHRHMNGYGSHTFKLVNANGEAVYCK
FHYKTDQGIKNLSVEDAARLSQEDPDYGIRDLFNAIATGKYPSW 
TFYIQVMTFNQAETFPFNPFDLTKVWPHKDYPLIPVGKLVLNRN
PVNYFAEVEQIAFDPSNMPPGIEASPDKMLQGRLFAYPDTHRHR 
LGPNYLHIPVNCPYRARVANYQRDGPMCMQDNQGGAPNYYPNSF 
GAPEQQPSALEHSIQYSGEVRRFNTANDDNVTQVRAFYVNVLNE
EQRKRLCENIAGHLKDAQIFIQKKAVKNFTEVHPDYGSHIQALL 
DKYNAEKPKNAIHTFVQSGSHLAAREKANL 


6586 


32 


804 


PLPEQPAESTSTMPVSGTPAPNKKRKSSKLIMELTGGGQESSGL 
NLGKKISVPRDVMLEELSLLTNRGSKMFKLRQMRVEKFIYENHP
DVFSDSSMDHFQKFLPTVGGQLGTAGQGFSYSKSNGRGGSQAGG
SGSAGQYGSDQQHHLGSGSGAGGTGGPAGQAGRGGAAGTAGVGE
TGSGDQAGGEGKHITVFKTYISPWERAMGVDPQQKMELGIDLLA
YGAKAELPKYKSFNRTAMPYGGYEKASKRMTFQMPKV


6587 


75 


1117 


RRVPSLGKMPECWDGEHDIETPYGLLHWIRGSPKGNRPAILTY
HDVGLNHKLCFNTFFNFEDMQEITKHFWCHVDAPGQQVGASQF 
PQGYQFPSMEQLAAMLPSWQHFGFKYVIGIGVGAGAYVLAKFA
LIFPDLVEGLVLVNIDPNGKGWIDWAATKLSGLTSTLPDTVLSH 
LFSQEELVNNTELVQSYRQQIGNWNQANLQLFWNMYNSRRDLD 
INRPGTVPNAKTLRCPVMLWGDNAPAEDGWECNSKLDPTTTT
FLKMADSGGLPQVTQPGKLTEAFKYFLQGMGYMPSASMTRLARS 
RTASLTSASSVDGSRPQACTHSESSEGLGQVNHTMEVSC


6588 


137 


501 


LGLQAQLLELRTNNYQLSDELRKNGVELTSLRQKVAYLDKEFSK 
AQKALSKSKKAQEVEVLLSENEMLQAKLHSQEEDFRLQNSTLMA
EFSKIiCSQMEQLEQENQQLKEGAAGAGVAQAGP 


6589 


2 


1405 


RPWGSAMATFSRQEFFQQLLQGCLLPTAQQGLDQIWLLLAICLA 
CRLLWRLGLPSYLKHASTVAGGFFSLYHFFQLHMVWWLLSLLC 
YLVLFLCRHSSHRGVFLSVTILIYLLMGEMHMVDTVTWHKMRGA
QMIVAMKAVSLGFDLDRGEVGTVPSPVEFMGYLYFVGTIVFGPW 
ISFHSYLQAVQGRPLSCRWLQKVARSLALALLCLVLSTCVGPYL
FPYFIPLNGDRLLRNKKRKARGTMVRWLRAYESAVSFHFSNYFV 
GFLSEATATLAGAGFTEEKDHLEWDLTVSKPLNVELPRSMVEW
TSWNLPMSYWLNNYVFKNALRLGTFSAVLVTYAASALLHGFSFH 
LAAVLLSLAFITYVEHVLRKRLARILSACVLSKRCPPDCSHQHR 
LGLGVRALNLLFGALAIFHLAYLGSLFDVDVDDTTEEQGYGMAY
TVHKWSELSWASHWVTFGCWIFYRLIG 


6590 


2177 


656 


VRAYEHVLSLLENVFTPMFCHRDEYFRQLLRGAESPTRNSKLNR 
GSLSLDDFRNTQKRGESFGISRIGSKIKGVFKSTTMEGAMLPNY 
GVAEGEDDFIEEGIWMEDDSPVEAVSTPNTPRNLAAWKISIPY
VDFFEDPSSERKEKKERIPVFCIDVERNDRRAVGHEPEHWSVYR 
RYLEFYVLESKLTEFHGAFPDAQLPSKRIIGPKNYEFLKSKREE 
FQEYLQKLLQHPELSNSQLLADFLSPNGGETQFLDKILPDVNLG 
KIIKSVPGKLMKEKGQHLEPFIMNFINSCESPKPKPSRPELTIL 
SPTSENNKKLFNDLFKNNANRAENTERKQNQNYFMEVMTVEGVY 
DYLMYVGRWFQVPDWLHHLLMGTRILFKNTLEMYTDYYLQCKL
EQLFQEHRLVSLITLLRDAIFCENTEPRSLQDKQKGAKQTFEEM 
MNYIPDLLVKCIGEETKYESIRLLFDGLQQPVLNKQLTYVLLDI 
VIQELFPELNKVQKEVTSVTSWM 


6591 


2177 


656


VRAYEHVLSLLENVFTPMFCHRDEYFRQLLRGAESPTRNSKLNR
GSLSLDDFRNTQKRGESFGISRIGSKIKGVFKSTTMEGAMLPNY 
GVAEGEDDFIEEGIWMEDDSPVEAVSTPNTPRNLAAWKISIPY
VDFFEDPSSERKEKKERIPVFCIDVERNDRRAVGHEPEHWSVYR
RYLEFYVLESKLTEFHGAFPDAQLPSKRIIGPKNYEFLKSKREE 
FQEYLQKLLQHPELSNSQLLADFLSPNGGETQFLDKILPDVNLG 
KIIKSVPGKLMKEKGQHLEPFIMNFINSCESPKPKPSRPELTIL 
SPTSENNKKLFNDLFKNNANRAENTERKQNQNYFMEVMTVEGVY 
DYLMYVGRWFQVPDWLHHLLMGTRILFKNTLEMYTDYYLQCKL
EQLFQEHRLVSLITLLRDAIFCENTEPRSLQDKQKGAKQTFEEM
MNYIPDLLVKCIGEETKYESIRLLFDGLQQPVLNKQLTYVLLDI 
VIQELFPELNKVQKEVTSVTSWM 


6592 


3 


1861 


APEFLGSTISSGSMIDANLKLLQBAEQRLKAIVAEKFAIATKEG 
DLPQVERFFKIFPLLGLHEEGLRKFSEYLCKQVASKAEENLLMV 
LGTDMSDRRAAVIFADTLTLLFEGIARIVETHQPIVETYYGPGR
LYTLIKYLQVECDRQVEKWDKFIKQRDYHQQFRHVQNNLMRNS
TTEKIEPRELDPILTEVTLMNARSELYLRFLKKRISSDFEVGDS 
MASEEVKQEHQKCLDKLLNNCLLSCTMQELIGLYVTMEEYFMRE 
TVNKAVALDTYEKGQLTSSMVDDVFYIVKKCIGRALSSSSIDCL 
CAMINLATTELESDFRDVLCNKLRMGFPATTFQDIQRGVTSAVN 
IMHSSLQQGKFDTKGIESTDEAKMSFLVTLNNVEVCSENISTLK 
KTLESDCTKLFSQGIGGEQAQAKFDSCLSDLAAVSNKFRDLLQE 
GLTELNSTAIKPQVQPWINSFFSVSHNIEEEEFNDYEANDPWVQ 
QFILNLEQQMAEFKASLSPVIYDSLTGLMTSLVAVELEKWLKS 
TFNRLGGLQFDKELRSLIAYLTTVTTWTIRDKFARLSQMATILN 
LERVTEILDYWGPNSGPLTWRLTPAEVRQVLALRIDFRSEDIKR 
LRL 


6593 


3 


1837 


EAFSAGSRRRGI.ALQRGVLGGLGGYCPCCCRRRGRLLVLLLLVR 
RGGEGGGGRGRGDKRRRRQARRQRRRPEPAEARGGKMADVLSVL
RQYNIQKKEIWKGDEVIFGEFSWPKNVKTNYWWGTGKEGQPR
SEQ ID NO:

Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence

Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence

Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








EYYTLDSILFLLNNVHLSHPVYVRRAATENIPWRRPDRKDLLG 
YLNGEASTSASIDRSAPLEIGLQRSTQVKRAADEVLAEAKKPRI
EDEECVRLDKERLAARLEGHKEGIVQTEQIRSLSEAMSVEKIAA 
IKAKIMAKKRSTIKTDLDDDITALKQRSFVDAEVDVTRDIVSRE 
RVWRTRTTILQSTGKNFSKNIFAILQSVKAREEGRAPEQRPAPN
AAPVDPTLRTKQPIPAAYNRYDQERFKGKEETEGFKIDTMGTYH 
GMTLKSVTEGASARKTQTPAAQPVPRPVSQARPPPNQKKGSRTP 
IIIIPAATTSLITMLNAKDLLQDLKFVPSDEKKKQGCQRENETL
IQRRKDQMQPGGTAISVTVPYRWDQPLKLMPQDWDRWAVFVQ 
GPAWQFKGWPWLLPDGSPVDIFAKIKAFHLKYDEVRLDPNVQKW 
DVTVLELSYHKRHLDRPVFLRVWETLDRYMVKHKSHLRF 


6594 


1 


1096 


EFPGRRFRGSQASPLCATCGPALLRAPTRAAMTRSLFKGNFWSA 
DILSTIGYDNIIQHLNNGRKNCKEFEDFLKERAAIEERYGKDLL 
NLSRKKPCGQSEINTLKRALEVFKQQVDNVAQCHIQLAQSLREE
ARKMEEFREKQKLQRKKTELIMDAIHKQKSLQFKKTMDAKKNYE 
QKCRDKDEAEQAVSRSANLVNPKQQEKLFVKLATSKTAVEDSDK 
AYMLHIGTLDKVREEWQSEHIKACEAFEAQECERINFFRNALWL 
HVNQLSQQCVTSDEMYEQVRKSLEMCSIQRDIEYFVNQRKTGQI 
PPAPIMYENFYSSQKNAVPAGKATGPNLARRGPLPIPKSSPDDP 
NYSLVDDYSLLYQ 


6595 


57 


781 


PLGTMSDSDLGEDEGLLSLAGKRKRRGNLPKESVKILRDWLYLH 
RYNAYPSEQEKLSLSGQTNLSVLQICNWFINARRRLLPDMLRKD
GKDPNQFTISRRGGKASDVALPRGSSPSVLAVSVPAPTNVLSLS
VCSMPLHSGQGEKPAAPFPRGELESPKPLVTPGSTLTLLTRAEA 
GSPTGGLFNTPPPTPPEQDKEDFSSFQLLVEVALQRAAEMELQK 
QQDPSLPLLHTPIPLVSENPQ


6596 


2 


1026 


PRLPVRRYHGRRRLQGRSRGHMAEGDAGSDQRQNEEIEAMAAIY 
GEEWCVIDDCAKIFCIRISDDIDDPKWTLCLQVMLPNEYPGTAP 
PIYQLNAPWLKGQERADLSNSLEEIYIQNIGESILYLWVEKIRD 
VLIQKSQMTEPGPDVKKKTEEEDVECEDDLILACQPESSVKALD
FDISETRTEVEVEELPPIDHGIPITDRRSTFQAHLAPWCPKQV 
KMVLSKLYENKKIASATHNIYAYRIYCEDKQTFLQDCEDDGETA 
AGGRLLHLMEILNVKNVMVWSRWYGGILLGPDRFKHINNCARN
ILVEKNYTNSPEESSKALGKNKKVRKDKKRNEH 


6597 


2 


1026 


PRLPVRRYHGRRRLQGRSRGHMAEGDAGSDQRQNEEIEAMAAIY
GEEWCVIDDCAKIFCIRISDDIDDPKWTLCLQVMLPNEYPGTAP 
PIYQLNAPWLKGQERADLSNSLEEIYIQNIGESILYLWVEKIRD 
VLIQKSQMTEPGPDVKKKTEEEDVECEDDLILACQPESSVKALD
FDISETRTEVEVEELPPIDHGIPITDRRSTFQAHLAPWCPKQV 
KMVLSKLYENKKIASATHNIYAYRIYCEDKQTFLQDCEDDGETA
AGGRLLHLMEILNVKNVMVWSRWYGGILLGPDRFKHINNCARN
ILVEKNYTNSPEESSKALGKNKKVRKDKKRNEH \ 


6598 


1099 


419 


PRVRWATTMAMSFEWPWQYRFPPFFTLQPNVDTRQKQLAAWCSL 
VLSFCRLHKQSSMTVMEAQESPLFNNVKLQRKLPVESIQIVLEE 
JjKKJMjW Ltti WLiUKb Kb 5 FL IMWRRPEE WGKL I YQWVS RSGQNNSV 
FTLYELTNGEDTEDEEFHGLDEATLLRALQALQQEHKAEIITVS 
DGPRRQVLLAGTCLPLLLTSHLSRAFKRRQTQCPPKTGSVTPPD 
SKGLQS 


6599 


164 


1593 


KMAALTTLFKYIDENQDRYIKKLAKWVAIQSVSAWPEKRGEIRR
MMEVAAADVKQLGGSVELVDIGKQKLPDGSEIPLPPILLGRLGS 
DPQKKTVCIYGHLDVQPAALEDGWDSEPFTLVERDGKLHGRGST
DDKGPVAGWINALEAYQKTGQEIPVNVRFCLEGMEESGSEGLDE 
LIFARKDTFFKDVDYVCISDNYWLGKKKPCITYGLRGICYFFIE 
VECSNKDLHSGVYGGSVHEAMTDLILLMGSLVDKRGNILIPGIN
EAVAAVTEEEHKLYDDIDFDIEEFAKDVGAQILLHSHKKDILMH
RWRYPSLSLHGIEGAFSGSGAKTVIPRKWGKFSIRLVPNMTPE 








WGEQVTSYLTKKFAELRSPNEFKVYMGHGGKPWVSDFSHPHYL 
AGRRAMKTVFGVEPDLTREGGSIPVTLTFQEATGKNVMLLPVGS 
ADDGAHSQNEKLNRYNYIEGTKMLAAYLYEVSQLKD


6600 


2 


934 


PGRLFRVAAMESAGLEQLLRELLLPDTERIRRATEQLQIVLRAP 
AALSALCDLLASAADPQIRQFAAVLTRRRLNTRWRRLAAEQRES 
LKSLILTALQRETEHCVSLSLAQLSATIFRKEGLEAWPQLLQLL 
QHSTHSPHSPEREMGLLLLSWVTSRPEAFQPHHRELLRLLNET 
LGEVGSPGLLFYSLRTLTTMAPYLSTEDVPLARMLVPKLIMAMQ 
TLIPIDEAKACEALEALDELLESEVPVITPYLSEVLTFCLEVAR 
NVALGNAIRIRILCCLTFLVKVKSKALLKNRLLATLAAHPFPHC
GC 


6601 


529 


1420 


PRAAARAPPPAVLRRDRRAATAPGAGEMTLHGPLAQRYFLNHIE
KITTWQDPRKAMNQPLNHMNLHPAVSSTPVPQRSMAVSQPNLVM 
NHQHQQQMAPSTLSQQNHPTQNPPAGLMSMPNALTTQQQQQQKL
RLQRIQMERERIRMRQEELMRQEAALCRQLPMEAETLAPVQAAV 
NPPTMTPDMRSITNNSSDPFLNGGPYHSREQSTDSGLGLGCYSV 
PTTPEDFLSNVDEMDTGENAGQTPMNINPQQTRFPDFLDCLPGT 
NVDLGTLESEDLIPLFNDVESALNKSEPFLTWL 


6602


127 


617 


LLDFPALPKFVLAQSPKAGKPSTMTSMTQSLREVIKAMTKARNF 
ERVLGKITLVSAAPGKVICEMKVEEEHTNAIGTLHGGLTATLVD
NISTMALLCTERGAPGVSVDMNITYMSPAKLGEDIVITAHVLKQ
GKTLAFTSVDLTNKATGKLIAQGRHTKHLGN 


6603


79 


660 


PVGPSSLAARTGLGHLPFLHRLASSRGLDMDLLQFLAFLFVLLL
SGMGATGTLRTSLDPSLEIYKKMFEVKRREQLLALKNLAQLNDI
HQQYKILDVMLKGLFKVLEDSRTVLTAADVLPDGPFPQDEKLKD 
AFSHWENTAFFGDWLRFPRIVHYYFDHNSNWNLLIRWGISFC 
NQTGVFNQGPHSPILSLM


6604


3 


688 


TSTAQRQGGERMSFRGGGRGGFNRGGGGGGFNRGGSSNHFRGGG
GGGGGGNFRGGGRGGFGRGGGRGGFNKGQDQGPPERWLLGEFL
HPCEDDIVCKCTTDENKVPYFNAPVYLENKEQIGKVDEIFGQLR
DFYFSVKLSENMKASSFKKLQKFYIDPYKLLPLQRFLPRPPGEK

GPPRGGGRGGRGGGRGGGGRGGGRGGGFRGGRGGGGGGFRGGRG 
GGFRGRGH 


6605 


7 


848 


SGSRRGAMRAAGVGLVDCHCHLSAPDFDRDLDDVLEKAKKANVV 
ALVAVAEHSGEFEKIMQLSERYNGFVLPCLGVHPVQGLPPEDQR 
SVTLKDLDVALPIIENYKDRLLAIGEVGLDFSPRFAGTGEQKEE 
QRQVLIRQIQLAKRLNLPVNVHSRSAGRPTINLLQEQGAEKVLL
HAFDGRPSVAMEGVRAGYFFSIPPSIIRSGQQKLVKQLPLTSIC
LETDSPALGPEKQVRNEPWNISISAEYIAQVKGISVEEVIEVTT 
QNALKLFPKLRHLLQK 


6606 


2 


1682 


FVEIRPRAEVANLSAHSASPIQDAVLKRLSLLEDIVYRQLNGLS
KSLGLIEGYGGRGKGGLPATLSPAEEEKAKGPHEKYGYNSYLSE 
KISLDRSIPDYRPTKCKELKYSKDLPQISIIFIFVNEALSVILR
SVHSAVNHTPTHLLKEIILVDDNSDEEELKVPLEEYVHKRYPGL 
VKWRNQKREGLIRARIEGWKVATGQVTGFFDAHVEFTAGWAEP
VLSRIQENRKRVILPSIDNIKQDNFEVQRYENSAHGYSWELWCM
YISPPKDWWDAGDPSLPIRTPAMIGCSFWNRKFFGEIGLLDPG 
MDVYGGENIELGIKVWLCGGSMEVLPCSRVAHIERKKKPYNSNI 
GFYTKRNALRVAEVWMDDYKSHVYIAWNLPLENPGIDIGDVSER
RALRKSLKCKNFQWYLDHVYPEMRRYNNTVAYGELRNNKAKDVC 
LDQGPLENHTAILYPCHGWGPQLARYTKEGFLHLGALGTTTLLP
DTRCLVDNSKSRLPQLLDCDKVKSSLYKRWNFIQNGAIMNKGTG
RCLEVKNRGLAGIDLILRSCTGQRWTIKNSIK


6607 


137 


986


VPACAGLKKSARSLLASPPRLLNTKLQASCRALFSPPIQSRQTT 
GISFQGRGGAGPGVPTRTQVFAAMGAVMGTFSSLQTKQRRPSKD 
KIEDELEMTMVCHRPEGLEQLEAQTNFTKRELQVLYRGFKNECP 








SGVVNEDTFKQIYAQFFPHGDASTYAHYLFNAFDTTQTGSVKFE 
DFVTALSILLRGTVHEKLRWTFNLYDINKDGYINQEEMMDIVKA 
IYDMMGKYTYPVLKEDTPRQHVDVFFQKMDKNKDGIVTLDEFLE 
SCQEDDNIMRSLQLFQNVM


6608 


224 


1140 


RPCFSSPTGLCPRLSYPMILLQHAVLPPPKQPSPSPPMSVATRS 
TGTLQLPPQKPFGQEASLPLAGEEELSKGGEQDCALEELCKPLY
CKLCNVTLNSAQQAQAHYQGECNHGKKLRNYYAANSCPPPARMSN 
WEPAATPWPVPPQMGSFKPGGRVILATENDYCKLCDASFSSP
AVAQAHYQGKNHAKRLRLAEAQSNSFSESSELGQRRARKEGNEF
KMMPNRRNMYTVQNNSGPYFNPRSRQRIPRDLAMCVTPSGQFYC 
SMCNVGAGEEMEFRQHLESKQHKSKVSEQRYRNEMENLGYV


6609 


1 


443 


FRLRCRRFRVAGGRLAGAGLRESRVPAPEQRLSALTLLSWSAVT 
PAAEPGNFQLSPAEPRGPLASPVRAAPRAPCPAAEMSELNTKTS
PATNQAAGQEEKGKAGNVKKAEEEEEIDIDLTAPETEKAALAIQ
GKFRRFQKRKKDPSS 


6610


319 


881 


GRKSLCNLHIFIRFPLTYPDMYMGMMCTAKKCGIRFQPPAIILI 
YESEIKGKIRQRIMPVRNFSKFSDCTRAAEQLKNNPRHKSYLEQ 
VSLRQLEKLFSFLRGYLSGQSLAETMEQIQRETTIDPEEDLNKL 
DDKELAKRKSIMDELFEKNQKKKDDPNFVYDIEVEFPQDDQLQS 
CGWDTESADEF 


6611 


978 


212 


PGCSGAGSRVWWLPALRHLAMGSTESSEGRRVSFGVDEEERVRV
LQGVRLSENWNRMKEPSSPPPAPTSSTFGLQDGNLRAPHKEST
LPRSGSSGGQQPSGMKEGVKRYEQEHAAIQDKLFQVAKREREAA 
TKHSKASLPTGEGSISHEEQKSVRLARELESREAELRRRDTFYK 
SQLERIERKNAEMYKLSSEQFHEAASKMESTIKPRRVEPVCSGL 
QAQILHCYRDRPHEVLLCSDLVKAYQRCVSAAHKG


6612 


1724 


992 


VSTHASALSRTQGQPQRQPRAAASGAGAGTAGGGGSGGAEGSKM 
STEAQRVDDSPSTSGGSSDGDQRESVQQEPEREQVQPKKKEGKI 
SSKTAAKLSTSAKRIQKELAEITLDPPPNCSAGPKGDNIYEWRS
TILGPPGSVYEGGVFFLDITFSPDYPFKPPKVTFRTRIYHCNIN 
SQGVICLDILKDNWSPALTISKVLLSICSLLTDCNPADPLVGSI
ATQYMTNRAEHDRMARQWTKRYAT 


6613 


130 


748 


ELELSSNMPEQSNDYRVAVFGAGGVGKSSLVLRFVKGTFRESYI
PTVEDTYRQVISCDKSICTLQITDTTGSHQFPAMQRLSISKGHA 
FILVYSITSRQSLEELKPIYEQICEIKGDVESIPIMLVGNKCDE
SPSREVQSSEAKALARTWKCAFMETSAKLNHNVKELFQELLNLE 
KRRTVSLQIDGKKSKQQKRKEKLKGKCVIM


6614 


3 


1191 


SSAAEAMRVLVRRCWGPPLAHGARRGRPSPQWRALARLGWEDCR

DSRVREKPPWRVLFFGTDQFAREALRALHAARENKEEELIDKLE 

WTMPSPSPKGLPVKQYAVQSQLPVYEWPDVGSGEYDVGWASF 

GRLLNEALILKFPYGILNVHPSCLPRWRGPAPVIHTVLHGDTVT

GVTIMQIRPKRFDVGPILKQETVPVPPKSTAKELEAVLSRLGAN 

MLISVLKNLPESLSNGRQQPMEGATYAPKISAGTSCIKWEEQTS 

EQIFRLYRAIGNIIPLQTLWMANTIKLLDLVEVNSSVLADPKLT

GQALIPGSVIYHKQSQILLVYCKDGWIGVRSVMLKKSLTATDFY 

NGYLHPWYQKNSQAQPSQCRFQTLRLPTKKKQKKTVAMQQCIE 


6615 


832 


35 


GRVGAGASAMSELPGDVRAFLREHPSLRLQTDARKVRCILTGHE 
LPCRLPELQVYTRGKKYQRLVRASPAFDYAEFEPHIVPSTKNPH 
QLFCKLTLRHINKCPEHVLRHTQGRRYQRALCKYEECQKQGVEY 
VPACLVHRRRRREDQMDGDGPRPREAFWEPTSSDEGGAASDDSM
TDLYPPELFTRKDLGSTEDGDGTDDFLTDKEDEKAKPPREKATD 
EGRRETTVYRGLVQKRGKKQLGSLKKKFKSHHRKPKSFSSCKQS 
G 


6616 


347 


1886 


LLPPCQGARPLSSPPHASEDNLFLFWNCILCAFPHPSPQPLQYP 
VWPLLLVITQIPAPRHLRNRPFSFSRGGLDSFSGSLSTPSICRS 








PAWVKMAPWPPKGLVPAVLWGLSLFLNLPGPIWLQPSPPPQSSP 
PPQPHPCHTCRGLVDSFNKGLERTIRDNPGGGNTAWEEENLSKY 
KDSETRLVEVLEGVCSKSDFECHRLLBLSEELVESWWFHKQQEA 
PDJjrUWljLbDSLKIjCC.PAGTFGPSCIjPCPGGTERPCGGYGQCEG 
EGTRGGSGHCDCQAGYGGEACGQCGLGYFEAERNASHLVCSACF 
GPCARCSGPEESNCLQCKKGWALHHLKCVDIDECGTEG^CGAD 
QFCVNTEGSYECRDCAKACLGCMGAGPGRCKKCSPGYQQVGSKC 
LiUv Ue>k.Ei 1 li»vt-PtjH,NKUt-hNl fc.GGYKCICAEGYKQMEGICVKEQ 
IPESAGFFSEMTEDELWLQQMFFGIIICALATLAAKGDLVFTA
IFIGAVAAMTGYWLSERSDRVLEGFIKGR


6617 


118 


673 


VWMAWQVSLLELEDRLQCPICLEVFKBSLMLQCX3HSYCKGCLVS 
LSYHLDTKVRCPMCWQAVDGSSSLPNVSLAWVIEALRLPGDPEP 

WnTHHOMDT.OT Tl/*>W vr\C\X?T TPPT.PPT T PCUriLrunvrnTPTTrrir 
ft. V U V ririKW r'Jj&ij r(,L JvUy £. Jj JL JjQAjljbwSHQHHPVTFISTVCS 

RMKEELAALFSELKQEQKKVDELIAKLVKNRTRIDGSAPSLCPC 
LGPATFTFL 


6618 


548


136 


DGKVARRAPNSPAFQNDIYPLVSAPRATTAESPWSKVLQNTQCR 
NVPKMTSERSRIPCLSAAAAEGTGKKQQEGRAMATLDRKVPSPE 
AFLGKPWSSWIDAAKLHCSDNVDLEEAGKEGGKSREVMRLNKEA 
WKYGT 


6619 


246 


842 


PASSEVLTAAVMFLLLNCIVAVSQNMGIGKNGDLPRPPLRNEFR 
YFQRMTTTSSVEGKQNLVIMGRKTWFSIPEKNRPLKDRINLVLS 
RELKEPPQGAHFLARSLDDALKLTERPELANKVDMIWIVGGSSV 
YKEAMNHLGHLKLFVTRIMQDFESDTFFSEIDLEKYKLLPEYPG
ILSDVQEGKHIKYKFEVCEKDD 


6620 


3 


1879 


NSRVDDFVARARMAAENEASQESALGAYSPVDYMSITSFPRLPE
DEPAPAAPLRGRKDEDAFLGDPDTDPDSFLKSARLQRLPSSSSE 
MGSQDGSPLRETRKDPFSAAAAECSCRQDGLTVIVTACLTFATG 
VTVALVMQIYFGDPQIFQQGAWTDAARCTSLGIEVLSKQGSSV
DAAVAAALCLGIVAPHSSGLGGGGVMLVHDIRRNESHLIDFRES 
APGALREETLQRSWETKPGLLVGVPGMVKGLHEAHQLYGRLPWS 
QVLAFAAAVAQDGFNVTHDLARALAEQLPPNMSERFRETFLPSG
RPPLPGSLLHRPDLAEVLDVLGTSGPAAFYAGGNLTLEMVAEAQ 
HAGGVITEEDFSNYSALVEKPVCGVYRGHLVLSPPPPHTGPALI
SAbNILEGFNLTSLVSREQALHWVAETLKIALALASRLGDPVYD 
STITESMDDMLSKVEAAYLRGHINDSQAAPAPLLPVYELDGAPT

WPNRTANHSAPSLENSVQPGKRPLSFLLPTWRPAEGLCGTYLA 
LGANGAARGLSGLTQVRFTPWLAFFSREPSCGLDCRCLSYLWLV
SIPHAANMG 


6621 


1 


662 


VQGITSYQQRLQALRKEKSRDAARSRRGKENFEFYELAKLLPLP 
AAITSQLDKASIIRLTISYLKMRDFANQGDPPWNLRMEGPPPNT
SVKVIGAQRRRSPSALAIEVFEAHLGSHILQSLDGYVFALNQEG
KFLYISETVSIYLGLSQVELTGSSVFDYVHPGDHVEMAEQLGMK 
LPPGRGLLSQGTAEDGASSASSSSQSETPEPWCFPPASDQFLL 


6622


2


319


fiDAcnanTrcTPRppDrDRDAMcaMMnvDVPTiPDeT qtvutpmpm 
uKAa la Ay IS c, 1 £, Aljvj P tt KAKAM b AW M t'xU<l^i , vjKbL«KIKVl&MGN 

AEVGKSCIIKRYCEKRFVSKYLATIGIDYGVTKVHVRDREIKVN
IFDMAGHPFFYEVRKPF


6623 


1886 


189 


KALFEKVKKFRLHVEEGDILYAMYVRQTVLKVIKFLIIIAYNSA
LVSKVQFTVDCNVDIQDMTGYKNFSCNHTMAHLFSKLSFCYLCF 
VSIYGLTCLYTLYWLFYRSLRJBYSFEYVRQETGFDD1PDVKNDF 
AFMLHMIDQYDPLYSKRFAVFLSEVSENKLKQLNLNNEWTPDKL 
RQKLQTNAHNRLELPLIMLSGLPDTVFEITELQSLKLEIIKNVM 
IPATIAQLDNLQELSLHQCSVKIHSAALSFLKENLKVLSVKFDD 
MRELPPWMYGLRNLEELYLVGSLSHDISRNVTLESLRDLKSLKI 
LSIKSNVSKIPQAVVDVSSHLQKMCIHNDGTKLVM1.NNLKKMTN 
LTELELVHCDLERIPHAVFSLLSIiQELDLKENXTLKSIEEIVSFQ 








HLRKLTVLKLWHNSITYIPEHIKKLTSLERLSFSHNKIEVLPSH 
LFLCNKIRYLDLSYNDIRFIPPEIGVLQSLQYFSITCNKVESLP 
DELYFCKKLKTLKIGKNSLSVLSPKIGNLLFLSYLDGKGNHFEI 
LPPELGDCRALKRAGLWEDALFETLPSDVREQMKTE


6624 


218 


1786 


GSRRGGGSRIPAVSTHVAPGRSVT.RPPA<;r;AT.RT.'R<3T vifiT nnn 
RGRPSGLAHLSQETSHWRAKRSGRACLGDFPGEILRSFIMKCTA 
REWLRVTTVLFMARAIPAMWPNATLLEKLLEKYMDEDGEWWIA
KQRGKRAITDNDMQSILDLHNKLRSQVYPTASNMEYMTWDVELE
RSAESWAESCLWEHGPASLLPSIGQNLGAHWGRYRPPTFHVQSW
YDEVKDFSYPYEHECNPYCPFRCSGPVCTHYTOWWATQMpTfzr 
AINLCHNMNIWGQIWPKAVYLVCNYSPKGNWWGHAPYKHGRPCS
ACPPSFGGGCRENLCYKEGSDRYYPPREEETNEIERQQSQVHDT
HVRTRSDDSSRNEVISAQQMSQIVSCEVRLRDQCKGTTCNRYEC 
PAGCLDSKAKVIGSVHYEMQSSICRAAIHYGIXDNDGGWVDITR
QGRKHYFIKSNRNGIQTIGKYQSANSFTVSKVTVQAVTCETTVE 
QLCPFHKPASHCPRVYCPRKLYASKSTLCSCNWNSSLF


6625 


1124 


543 


PGPRGGGGSLLSTKALGRSRGLGMHPGPSSGCjTEGGVPTALRPP 
GPLVPSTSDDNLLKNIELFDKLALRFHGRLLFLKDVLGDEICCW
SFYGQGRKIAEVCCTSIVYATEKKQTKVEFPEARIFEETLNILI 
YETPRGPDPALLEATGGAAGAGGAGRGEDEENREHRVRRIHVRR
HITHDERPHGQQIVFKD 


6626 


3 


1498


iMaf vi lUKr iiLlLul&VhtLCSLRSDATMESITACLHALQAL 
LDVPWPRSKIGSDQDSGIELLNVLHRVILTRESPSIQLASLEW 
RQ I I CAAQEHVKEKRRSAEVDDGAAEKETLPEFGEGKDTGGLVP 
GKSLVFATLELCVCILVRQLPELNPKLTGSPGVKATKPQILLED 
GSRLVSAALVILSELPAVCSPEGSISILPTILYLTIGVLRETAV

KLPGGQLS ST VAASLQALKGI LS S PMARABKSRTAWTDLLRS AL 

TT I LDCWDPVDPTHnPTiDP'VCIT .T .T A T TVP T T .QTQ omrw T nrn- r\ 
a x j.uu\+nv c vul j. nyoiiuo voubi/il I VP 1 i_to 1 o ire* V 1 1 X ir 

KRCIDKFKATLEIKDPWQIKTYQLLHSIFQYPNPAVSYPYIYS
LASCIMEKLQEIDKRKPENTAELEIFQEGIKVLETLVTVAEEHH
RAQLVACLLPILISFLLDENSLGSATSIMRNLHDFALQNLMQIG 
PQYSSVFKSLVASSPALKARLEAAIKGNQESVKVKIPTSKYTKS 
PGKNSSIQLKTSFL 


6627 


1 


697


GIPHLSSRDMTGTPGAVATRDGEAPERSPPCSPSYDLTGKVMLL
GDTGVGKTCFLIQFKDGAFLSGTFIATVGIDFRNKWTVDGVRV
KLQIWDTAGQERFRSVTHAYYRDAQALLLLYDITNKSSFDNIRA
WLTEIHEYAQRDWIMLLGNKADMSSERVIRSEDGETLAREYGV 
PFLETSAKTGMNVELAFLAIAKELKYRAGHQADEPSFQIRDYVE 
SQKKRSSCCSFM 


6628 


1 


1861 


QCAEFGGGSGGGGGSGGGGSGGGRGAGGEENKENERPSAGSKAN
KEFGDSLSLEILQIIKESQQQHGLRHGDFQRYRGYCSRRQRRLR 
KTIjNF KM^NRH KFTGKKVTE ELLTDNR YLLL VLMDAERAW S YAM 
yjbi\y£»AW 1 a^Kl^rHLLSKLKKAVKxiAEELER 
LEAQAYTAYLSGMLRFEHQEWKAAIEAFNKCKTIYEKLASAFTE 
EQAVLYNQRVEEISPNIRYCAYNIGDQSAINELMQMRLRSGGTE
GLLAEKLEALITQTRAKQAATMSEVEWRGRTVPVKIDKVRIFLL
GLADNEAAIVQAESEETKERLFESMLSECRDAIQWREELKPDQ 
KQRDYILEGEPGKVSNLQYLHSYLTYIKLSTAIKRNENMAKGLQ
RALLQQQPEDDSKRSPRPQDLIRLYDIILQNLVELLQLPGLEED
KAFQKEIGLKTLVFKAYRCFFIAQSYVLVKKWSEALVLYDRVLK 
YANEVNSDAGAFKNSLKDLPDVQELITQVRSEKCSLQAAAILDA
NDAHQTETSSSQVKDNKPLVERFETFCLDPSLVTKQANLVHFPP 
GFQPIPCKPLFFDLALNHVAFPPLEDKLEQKTKSGLTGYIKGIF 
GFRS 


6629 


5653


4549 


GATPLGSVGGRTGKMDAATLTYDTLRFAEFEDFPETSEPVWILG
RKYSIFTEKDEILSDVASRLWFTYRKNFPAIGGTGPTSDTGWGC 








MLRCGQMIFAQALVCRHLGRDWRWTQRKRQPDSYFSVLNAFIDR
KDSYYSIHQIAQMGVGEGKSIGQWYGPNTVAQVLECKLAVFDTWS 
SLAVHIAMDNTVVMEEIRRLCRTSVPCAGATAFPADSDRHCNGF
PAGAEVTNRPSPWRPLVLLIPLRLGLTDINBAYVETLKHCFMMP
QSLGVIGGKPNSAHYFIGYVGEELIYLDPHTTQPAVEPTDGCFI 
PDESFHCQHPPCRMSIAELDPSIAWRGGHLSTQAFGAECCLGM 
TRKTFGFLRFFFSMLG 


6630 


2 


423 


LVQCGGIRRRSAWGAMPGRHVSRVRALYKRVLQLHRVLPPDLKS 
LGDQYVKDEFRRHKTVGSDEAQRFLQEWEVYATALLQQANENRQ 
NSTGKACFGTFLPEEKLNDFRDEQIGQLQELMQEATKPNRQFSI 
SESMKPKF


6631


2 


423 


LVQCGGIRRRSAWGAMPGRHVSRVRALYKRVLQLHRVLPPDLKS
LGDQYVKDEFRRHKTVGSDEAQRFLQEWEVYATALLQQANENRQ 
NSTGKACFGTFLPEEKLNDFRDEQIGQLQELMQEATKPNRQFSI
SESMKPKF 


6632 


1273 


588 


WNSRGRTQRGAAPLAPAAAMKAWQRVTRASVTVGGEQlSAIGR 
GICVLLGISLEDTQKELEHMVRKILNLRVFEDESGKHWSKSVMD 
KQYEILCVSQFTLQCVLKGNKPDFHLAMPTEQAEGFYNSFLEQL
RKTYRPELIKDGKFGAYMQVHIQNDGPVTIELESPAPGTATSDP 
KQLSKLEKQQQRKEKTRAKGPSESSKERNTPRKEDRSASSGAEG 
DVSSEREP 


6633 


1145 


617 


ATGRHEGVPTLEGIIQQLVNGIITPATIPSLGPWGVLHSNPMDY 
AWGANGLDAIITQLLNQFENTGPPPADKEKIQALPTVPVTEEHV
GSGLECPVCKDDYALGERVRQLPCNHLFHDGCIVPWLEQHDSCP 
VCRKSLTGQNTATNPPGLTGVSFSSSSSSSSSSSPSNENATSNS 


6634 


1 


1134 


CGGIPRKGSGPRRRLPMARLRDCLPRLMLTLRSLLFWSLVYCYC 
GLCASIHLLKLLWSLGKGPAQTFRRPAREHPPACLSDPSLGTHC 
YVRIKDSGLRFHYVAAGERGKPLMLHiHGFPEFWYSWRYQLREF 
KSEYRWALDLRGYGETDAPIHRQNYKLDCLITDIKDILDSLGY
SKCVLIGHDWGGMIAWLIAICYPEMVMKLIVINFPHPNVFTEYI
LRHPAQLLKSSYYYFFQIPWFPEFMFSINDFKVLKHLFTSHSTG 
IGRKGCQLTTEDLEAYIYVFSQPGALSGPINHYRNIFSCLPLKH 
HMVTTPTLIJjWGENDAFMEVEMAEVTRFYVKNYFRLTILSEASH 
WLQQDQPDIVNKLIWTFLKEETRKKD 


6635


1420 


470 


EMRAGQQLASMLRWTRAWRLPREGLGPHGPSFARVPVAPSSSSG 
GRGGAEPRPLPLSYRLLDGEAALPAWFIjHGLFGSKTNFNSIAK 
ILAQQTGRRVLTVDARNHGDSPHSPDMSYEIMSQDLQDLLPQLG 
LVPCVVVGHSMGGKTAMLLALQRPELVERLIAVDISPVESTGVS
HFATYVAAMRAINIADELPRSRARKLADEQLSSVIQDMAVRQHL 

LTNLVEVDGRFVWRVNLDALTQHLDKILAFPQRQESYLGPTLFL
LGGNSQFVHPSHHPEIMRLFPRAQMQTVPNAGHWIHADRPQDFI
AAIRGFLV


6636 


1514 


1801


SFCMFSHKQDSHFQAVPVQEKKKRLRRAPWRAFAQPQRLKHPAE 
QPIVRQCLQRPPLCGVLGPVQQQLPPSLGPVLSPHSDPGWCRVD 
DGGDGVF 


6637 


2 


1501 


CSSSPCFHDGTCVLDKAGSYKCACLAGYTGQRCENLLEAGKSKI 
KASEDSLSVLEERNCSDPGGPVNGYQKITGGPGLINGRHAKIGT 
WSFFCWNSYVLSGNEKRTCQQNGEWSGKQPICIKACREPKISD 
LVRRRVLPMQVQSRETPLHQLYSAAFSKQKLQSAPTKKPALPFG
DLPMGYQHLHTQLQYECISPFYRRLGSSRRTCLRTGKWSGRAPS 
CIPICGKIENITAPKTQGLRWPWQAAIYRRTSGVHDGSLHKGAW
FLVCSGALVNERTWVAAHCVTDLGKVTMIKTADLKWIjGKFYR 
DDDRDEKTIQSLQISAIILHPNYDPILLDADIAILKLLDKARIS 
TRVQPICLAASRDLSTSFQESHITVAGWNVLADVRSPGFKNDTL 
RSGWSWDSLLCEEQHEDHGIPVSVTDNMFCASWEPTAPSDIC
TAETGGIAAVSFPGRASPEPRWHLMGLVSWSYDKTCSHRLSTAF








TKVLPFKDWIERNMK 


6638 


1391 


224 


GGIPQAGGKMAAPWWRAALCECRRWRGFSTSAVLGRRTPPLGPM 
PNSDIDLSNLERLEKYRSFDRYRRRAEQEAQAPHWWRTYREYFG
EKTDPKEKIDIGLPPPKVSRTQQLLERKQAIQELRANVEEERAA 

PT.P TAQVPT.nAVPAPWPRTCRPVWTfnPr.&FYYnT.YBriT.FWfiiATI? 

VPRVPLHVAYAVGEDDLMPVYCGNEVTPTEAAQAPEVTYEAEEG 
SLWTLLLTSLDGHLLEPDAEYLHWLLTNIPGNRVAEGQVTCPYL
PPFPARGSGIHRLAFLLFKQDQPIDFSEDARPSPCYQLAQRTFR 
TFDFYKKHQETMTPAGLSFFQCRWDDSVTYIFHQLLDMREPVFE 
FVRPPPYHPKQKRFPHRQPLRYLDRYRDSHEPTYGIY 


6639 


2046 


1268 


IGCFIMDGGDDGNLIIKKRFVSEAELDERRKRRQEEWEKVRKPB 

T1PF EC PRRVYfl PP <3 T iVRR T .OPfi VTiTt IfOOPYP PO PTf PTTMM^/P f3T .n 

EDETNFLDEVSRQQELIEKQRREEELKELKEYRNNLKKVGISQE
NKKEVEKKLTVKPIETKNKFSQAKLIAGAVKHKSSESGNSVKRL
KPDPEPDDKNQEPSSCKSLGNTSLSGPSIHCPSAAVCIGILPGL 
GAYSGSSDSESSSDSEGTINATGKIVSSIFRTNTFLEAP 


6640 


117 


1043 


VLEPPDVSMAESEDRSLRIVLVGKTGSGKSATANTILGEEIFDS 
RIT^AQAVTKNCQKASREWQGRDLLWDTPGLFDTKESLDTTCKE 

KHMVILFTRKEELEGQSFHDFIADADVGLKSIVKECGNRCCAFS 
NSKKTSKAEKESQVQELVELIEKMVQCNEGAYFSDDIYKDTEER 
LKQREEVLRKIYTDQLNEEIKLVEEDKHKSEEKKEKEIKLLKLK 
YDEKIKNIREEAERNIFKDVFNRIWKMLSEIWHRFLSKCKFYSS 


6641 


1 


894 


SAAVGRRSEVRGCAPRPRLRRSARRMDPVPGTDSAPLAGLAWSS 
ASAPPPRGFSAISCTVEGAPASFGKSFAQKSGYFLCLSSLGSLE
NPQENW7U) IQI WDKS PLPLGFSPVCDPMDS KASVSKKKRMCV 
KLLPLGATDTAVFDVRLSGKTKTVPGYLRIGDMGGFAIWCKKAK 
APRPVPKPRGLSRDMQGLSLDAASQPSKGGLLERTASRLGSRAS 
TLRRNDSIYEASSLYGISAMDGVPFTLHPRFEGKSCSPLAFSAF
GDLTIKSLADIEEEYNYGFWEKTAAARLPPSVS


6642


22 


1296 


PLEERMMTKMDPNDQAQRDI I FELRRIAFDAESDPSNAPGrSCaTE 
KRKAMYTKDYKMLGFTNHINPAMDFTQTPPGMLALDNMLYLAKV 
HQDTYIRIVLENSSREDKHECPFGRSAIELTKMLCEILQVGELP 
NEGRNDYHPMFFTHDRAFEELFGICIQLLNKTWKEMRATAEDFN

KVMnWRPnTTPAT.pc?VPNQT.rinPWQVT.pqT.<3VC!T?TT PT.OOQK'T? 

MSQDDFQSPPIVELREKIQPEILELIKQQRLNRLCEGSSFRKIG 
NRRRQERFWYCRLALNHKVLHYGDLDDNPQGEVTFESLQEKIPV 
ADIKAIVTGKDCPHMKEKSALKQNKEVLELAFSILYDPDETLNF
IAPNKYEYCIWIDGLSALLGKDMSSELTKSDLDTLLSMEMKLRL
LDLENIQIPEAPPPIPKEPSSYDFVYHYG 


6643 


3049 


2265 


SLHAPAEGRTRGRLAEKPKMLTRKIKLWDINAHITCRLCSGYLI 
DATT VT E CLHTF C£ S CLVK YLEENNT C P TCR I V I HQS H PLQ Y I G 
HDRTMQDIVYKLVPGLQEAEMRKQREFYHKLGMEVPGDIKGETC
SAKQHLDSHRNGETKADDSSNKEAAEEKPEEDNDYHRSDEQVSI 
CLE CNS S KLRGL KRKW IRC S AO ATVLHL K KF T ATC KT .WT . Q PN PT . 
DILCNEEILGKDHTLKFVWTRWRFKKAPLLLHYRPKMDLL 


6644 


1489 


290 


FRPLATEPRGSSPVQLVSSTMSVRTLPLLFLNLGGEMLYILDQR
LRAQNIPGDKARKVLNDIISTMFNRKFMEELFKPQELYSKKALR 
TVYERLAHASIMKLNQASMDKLYDLMTMAFKYQVLLCPRPKDVL
LVTFNHLDTIKGFIRDSPTILQQVDETLRQLTEIYGGLSAGEFQ
LIRQTLLIFFQDLHIRVSMFLKDKVQNNNGRFVLPVSGPVPWGT 
EVPGLIRMFNNKGEEVKRIEFKHGGNYVPAPKEGSFEFYGDRVL 
KLGTNMYSWQPVETHVSGSSKNLASWTQESIAPNPLAKEELNF 
LARLMGGMEIKKPSGPEPGFRLNLFTTDEEEEQAALTRPEELSY
EVINIQATQDQQRSEELARIMGEFEITEQPRLSTSKGDDLLAMM 
DEL 


6645 


6530 


4646 


FVEGLAGYVYKAASEGKVLTLAALLLNRSESDIRYLLGYVSQQG 
GQRS TPL 1 1 AAJRNGHAKWRLLLEH YRVQTQQTGTVRFDGYV I D 
GATALWCAAGAGHFEVVKLLVSHGANVNHTTVTNSTPLRAACFD 
GRLDIVKYLVENNANISIANKYDNTCLMIAAYKGHTDWRYLLE 
QRADPNAKAHCGATALHFAAEAGHIDIVKELIKWRAAIVVNGHG 
MTPLKVAAESCKADWELLLSHADCDRRSRIEALELLGASFAND 
REN YD 1 1 KTYH YLYLAMLERFQDGDN ILE KE VLPP I HAYGNRTE 
CRNPQELESIRQDRDALHMEGLIVRERILGADNIDVSHPIIYRG
AVYADNMEFEQCIKLWLHALHLRQKGNRNTHKDLLRFAQVFSQM 
IHLNETVKAPDIECVLRCSVLEIEQSMNRVKNISDADVHNAMDN
YECNLYTFLYLVCISTKTQCSEEDQCKINKQIYNLIHLDPRTRE 
GFTLLHLAVNSNTPVDDFHTNDVCSFPNALVTKLLLDCGAEVNA
VDNEGNSALHIIVQYNRPISDFLTLHSIIISLVEAGAHTDMTNK
QNKTPLDKSTTGVSEILLKTQMKMSLKCLAARAVRANDINYQDQ
IPRTLEEFVGFH 


6646 


176 


890 


PSSRIWHLPEDMENALTGSQSSHASLRNIHSINPTQLMARIESY 
EGREKKGISDVRRTFCLFVTFDLLFVTLLWIIELNWGGIENTL 
EKEVMQYDYYSSYFDIFLLAVFRFKVLILAYAVCRLRHWWAIAL 
TTAVTS AFLIAKVILS KLFSQGAFGYVLP 1 1 S F I LAW I ETWFLD 
FKVLPQEAEEENRLLIVQDASERAALIPGGLSDGQFYSPPESEA 
GSEEAEEKQDSE KPLLEL 


6647


176 


890 


PSSRMNHLPEDMENALTGSQSSHASLRNIHSINPTQLMARIESY
EGRE KKG I S D VRRT FCL FVTFDLL FVTLLW 1 1 ELNVNGGI ENTL 
EKEVMQYDYYSSYFDIFLLAVFRFKVLILAYAVCRLRHWWAIAL
TTAVTSAFLLAKVILSKLFSQGAFGYVLPIISFILAWIETWFLD 
FKVLPQEAEEENRLLIVQDASERAALIPGGLSDGQFYSPPESEA 
GSEEAEEKQDSE KPLLEL 


6648 


413 


897 


RNCWNCFTKYFNSPPEDIDHKDSYLITRSIMAEPDYIEDDNPEL 
IRPQKLINPVKTSRNHQDLHRELLMNQKRGLAPQNKPBLQKVME 
KRKRDQVIKQKEEEAQKKKSDLEIELLKRQQKLEQLELEKQKLQ
EEQENAPEFVKVKGNLRRTGQEVAQAQES 


6649 


1357 


832 


WIPRAAGIRHEVKWDVKEIMSQHNIYVDALLKEFEQFNRRLNEV
SKRVRIPLPVSNILWEHCIRLANRTIVEGYANVKKCSNEGRALM 
QLDFQQFLMKLEKLTDIRPIPDKEFVETYIKAYYLTENDMERWI 
KEHREYSTKQLTNLVNVCLGSHINKKARQKLLAAIDDIDRPKR 


6650 


32 


765 


LVPLVFSLLVQSCKQVYRSIAMKFVPCLLLVTLSCLGTLGQAPR
QKQGSTGEEFHFQTGGRDSCTMRPSSLGQGAGEVWLRVDCRNTD 
QTYWCEYRGQPSMCQAFAADPKSYWNQALQELRRLHHACQGAPV 
LRPSVCREAGPQAHMQQVTSSLKGSPEPNQQPEAGTPSLRPKAT 
VKLTEATQLGKDSMEELGKAKPTTRPTAKPTQPGPRPGGNEEAK
KKAWEHCWKP FQALCAFL I S FFRG 


6651


3425 


1353 


AKELLKVGDFSLCAGPYQNTADTMENLSKEPLASFVSESFDISA
CGIATEHVKIDNSGEGLTAEAGSETLSRDGEVGVNSDMHYELSG 
DSDLDLLGDCRNPRLDLEDSYTLRGSYTRKKDVPTDGYESSLNF 
HNNNQEDWGCSSt^VPGMETSLPPGHWTAAVKKEEKCVPPYVQIR 
DLHGILRTYANFSITKELKDTMRTSHGLRRHPSFSANCGLPSSW 
TSTWQVADDLTQNTLDLEYLRFAHKLKQTIKNGDSQHSASSANV 
FPKESPTQISIGAFPSTKISEAPFLHPAPRSRS PLLVTWESDP 
RPQGQPRRGYTASSLDSSSSWRERCSHNRDLRNSQRNHTVSFHL 
NKLKYNSTVKESRNDISLILNEYAEFNKVMKNSNQFIFQDKELN 
DVSGEATAQEMYLPFPGRSASYEDII IDVCTNLHVKLRSWKEA 
CKSTFLFYLVETEDKSFFVRTKNLLRKGGHTEIEPQHFCQAFHR 
ENDTLIIIIRNEDISSHLHQIPSLLKLKHFPSVIFAGVDSPGDV 
LDHTYQELFRAGGFVISDDKILEAVTLVQLKEIIKILEKLNGNG 
RWKWLLHYRENKKLKEDERVDSTAHKKNIMLKSFQSANIIELLH 
YHQCDSRSSTKAEILKCLLNLQIQHIDARFAVLLTDKPTIPREV 
FENNG I L VTDVNNF I EN T F K T A A P PR eft; vh 


6652 


2 


1343 


IPGSTISCSCHSRRLRGGSPAPRLSLGAASPRPRPPSLPLPLPL 
PFPLFLPTRPAERAWIRSRRASEWVGKMEVPRLDHALNSPTSPC 
EEVIKNLSLEAIQLCDRDGNKSQDSGIAEMEELPVPHNIKISNI 
TCDSFKISWEMDSKSKDRITHYFIDLNKKENKNSNKFKHKDVPT
KLVAKAVPLPMTVRGHWFLSPRTEYTVAVQTASKQVDGDYWSE 
WSEIIEFCTADYSKVHLTQLLEKAEVIAGRMLKFSVFYRNQHKE 
YFDYVREHHGNAMQPSVKDNSGSHGSPISGKLEGIFFSCSTEFN 
TGKPPQDSPYGRYRFEIAAEKLFNPNTNLYFGDFYCMYTAYHYV 
ILVIAPVGSPGDEFCKQRLPQLNSKDNKPLTCTEEDGVLVYHHA 
QDVILEVIYTDPVDLSLGTVAEITGHQLMSLSTANAKKDPSCKT 
CNISVGR 


6653 


170 


1910 


FFLEPRLRPFPASRARFVPARTRPSPLHPCCFCFEGGGSMLSPQ 
R VAAAAS RG ADD AM ESS KPG P VQ WL VQKDQH S FE LDEKALAS I 
liij\J Un x HD L»D v WV S VAG AFR KGKS F I LD FM LR YL Y S Q KE S GHS 
NWLGDPEEPLTGFSWRGGSDPETTGIQIWSEVFTVEKPGGKKVA 
WLMDTQGAFDSQSTVKDCATIFALSTKTSSVQIYNLSQNIQED 
DLQQLQLFTEYGRLAMDEIFQKPFQTLKPLVRDWSFPYEYSYGL 
QGGMAFLDKRLQVKEHQHEEIQNVRNHIHSCFSDVTCFLLPHPG 
LQVATSPDFDGKLKDIAGEFKEQLQALIPYVLNPSKLMEKEING 
oj\v ± v-i\vjjjuej i r tSyfti J..M iyoi3UJjrnrx\oMIjQATAEAYNLAAAA 
SAKDIYYNNMEEVCGGEKPYLSPDILEEKHCEFFCQLALDHFKKT 
KKMGGKDFSFRYQQELEEEIKELYENFCKHNGSKNVFSTFRTPA 
VLFTGIVALYIAQfiT.TGFTRT.FVVAnT.FNnivmriT t t tutt Twnv 

IRYSGQYRELGGAIDFGAAYVLEQASSHIGNSTQATVRDAWGR 
PSMDKKAQ 


6654 


1 


705 


RTS LS PSQCS S FNLxAMAS AGMQ I LG WLTLLGW VNGLVS CALPM 
WKVTAFIGNSIWAQWWEGLWMSCWQSTGQMQCKVYDSLIiAL 
PQDLQAARALCVIALLVALFGLLVYLAGAKCTTCVEEKDSKARL
VLTSGIVFVISGVLTLIPVCWTAHAVIRDFYNPLVAEAQKRELG
ASLYLGWAASGLLLLGGGLLCCTCPSGGSQGPSHYMARYSTSAP 
AISRGPSEYPTKNYV 


6655 


341 


16 


KDAYMFKKGLLALALVFSLPVFAAEHWIDVRVPEQYQQEHVQGA
x ln x v J-i v jvtsit x A i A V r DaxnDTVKV Y CNAGRQ SGQAKE I LS EMG 
YTHVENAGGLKDIAMPKVKG 


6656 


2 


1212 


TELPPRPANLAIQPPLSPLRALAPLPEKPGAVP.PPQKR^4AKVAK 
DLNPGVKKMSLGQLQSARGVACLGCKGTCSGFEPHSWRKICKSC 
KCSQEDHCLTSDLEDDRKIGRLLMDSKYSTLTARVKGGDGIRIY 
KRNRMIMTNPIATGKDPTFDTITYEWAPPGVTQKLGLQYMELIP
KEKQPVTGTEGAFYRRRQLMHQLPIYDQDPSRCRGLLENELKLM 
EEFVKQYKSEALGVGEVALPGQGGLPKEEGKQQEKPEGAETTAA
TTNGSLSDPSKEVEYVCEIiCKGAAPPDSPWYSDRAGYNKQWHP 
TCFVCAKCSEPLVDLIYFWKDGAPWCGRHYCESLRPRCSGCDEI
1 r AJbL) i UK VliDx4AWHRKHFVCEGCEQLLSGRAY I VTKGQLLCPT 
CSKSKRS 


6657 


830 


2126 


LLTCQERAGDCLLSASTMKEWYWSPKKVADWLLENAMPEYCEP 
LEHFTGQDLINLTQEDFKKPPLCRVSSDNGQRLLDMIETLKMEH 
HLEAHKNGHANGHLNIGVDIPTPDGSFSIKIKPNGMPNGYRKEM 
IKIPMPELERSQYPMEWGKTFLAFLYALSCFVLTTVMISVVHER 
VPPKEVQPPLPDTFFDHFNRVQWAFSICEINGMILVGLWLIQWL
LLKYKSIISRRFFCIVGTLYLYRCITMYVTTLPVPGMHFNCSPK
LFGDWEAQLRRIMKLIAGGGLSITGSHNMCGDYLYSGHTVMLTL
T YLF I KEYS PRRLW W YHW I C WLLS WG I FC I LLAHDH YTVDVW 
AYYITTRLFWWYHTMANQQVLKEASQMNLLARVWWYRPFQYFEK 
NVQGIVPRSYHWPFPWPWHLSRQVKYSRLVNDT 


6658 


35 


855


HCCALGAPGSPYRGLYFSSAAPCTAPRKAKHQSTLEGLTKRMLM
FDPVPVKQEAMDPVSVSYPSNYMESMKPNKYGVIYSTPLPEKFF 
QTPEGLSHGIQMEPVDLTVNKRSSPPSAGNSPSSLKFPSSHRRA 
SPGLSMPSSSPPIKKYSPPSPGVQPFGVPLSMPPVMAAALSRHG
IRSPGILPVIQPVWQPVPFMYTSHLQQPLMVSLSEEMENSSSS 
MQVPVIESYEKPISQKKIKIEPGIEPQRTDYYPEEMSPPLMNSV 
SPPQALLQE 


6659 


18 


523 


EPQRGDCETWFQNCSLPKFVCFFCWGFWLWRAHSMSNLHSLPGL
RGLTS I SRNQLQ CTNAMRVINNYQRR WKNQNTFLLATFANWNV 
CGNPTITCPHNRTIiNNCHHSGVQVPLMYCNLTTPSPQNISNCRY 
AQTPANM FY I VACDNRDQRRDPPQYPW PVHLHT 1 1 


6660 


514 


1707 


CAASLDCRHHLCEPDMKLVWPSAKLLQAAAGASARACDSVTSNV
LPLLLEQFHKHSQSSQRRTILEMLLGFLKLQQKWSYEDKDQRPL 
NGFKDQLCSLVFMALTDPSTQLQLVGIRTLTVLGAQPDLLSYED 
LELAVGHLYRLSFLKEDSQSCRVAALEASGTLAALYPVAFSSHL 
VPKLAEELRVGESNLTNGDEPTQCSRKLCCLQALSAVSTHPSIV 
KETLPLLLQHLWQVNRGNMVAQSSDVIAVCQSLRQMAEKCQQDP
ESCWYFHQTAIPCLLALAVQASMPEKEPSVLRKVLLEDEVLAAM
VSVIGTATTHLSPELAAQSVTHIVPLFLDGNVSFLPENSFPSRF 
QPFQEX3SSGQRRL1ALLMAFVCSLPRNVSEHIWEVLLFNLDKVT 
PG 


6661 


179 


430 


GVHAASGTLSATWLAEAKMFDSLAKAGKYLGQAAKLMIGMPDYD 
NYVEHMRVNHPDQTPMTYEEFFRERQDARYGGKGGARCC 


6662 


185 


423 


RSLPKPAPAQPASIHCARFSGVTPPTAKTAMSDGNTAFNALMYC
GPKADDGNI FSACAPASSAVKAS VS VAQPGQAVI P 


6663 


3 


1005 


RPVLSSRVDDFVPPLPETSGRRKKLERMYSVDRVSDDIPIRTWF
PKENLFSFQTASTTMQAISNFRKHLRMVGSRRVKAQTFAERRER
SFSRSWSDPTPMKADTSHDSRDSSDLQSSHCTLDEAFEDLDWDT 
EKGLEAVACDTEGFVPPKVMLISSKVPKAEYIPTIIRRDDPSII
PILYDHEHATFEDILEEIERKLNVYHKGAKIWKMLIFCQGGPGH 
LYIiLKNKVATFAKVEKEEDMIHFWKRLSRLMSKVNPEPNVIHIM 
GCYILGNPNGEKLFQNLRTLMTPYRVTFESPLELSAQGKQMIET 
YFDFRLYRLWKSRQHSKLLDFDDVL 


6664 


58 


968 


PRLLRLPRSVWMDSPWDELALAFSRTSMFPFFDIAHYLVSVMA 
VKRQPGAAALAWKNPISSWFTAMLHCFGGGILSCLIiLAEPPLKF 
LANHTNILLASSIWYITFFCPHDLVSQGYSYLPVQLLASGMKEV
TRTWKIVGGVTHANSYYKNGWIVMIAIGWARGAGGTIITNFERL 
VKGDWKPEGDEWLKMSYPAKVTLLGSVIFTFQHTQHLAISKHNL 
MFLYTIFIVATKITMMTTQTSTMTFAPFEDTLSWMLFGWQQPFS 
SCEKKSEAKSPSNGVGSLASKPVDVASDNVKKKHTKKNE 


6665 


171 


1278 


DERRLACRQWTQQRSELYPGFQKRQRFLPKAGEEAAAQGGRHL 
PGRWLGPGCTQNPCSVHTATGPEPRKLPLLPPDSPNSGYPKEPA 
ALCPGIPSPCRMTHQDLSITAKLINGGVAGLVGVTCVFPIDLAK
TRLQNQHGKAMYKGMIDCLMKTARAEGFFGMYRGAAVNLTLVTP 
EKAIKLAANDFFRRLLMEDGMQRNLKMEMLAGCGAGMCQVWTC 
PMEMLKIQLQDAGRLAVHHQGSASAPSTSRSYTTGSASTHRRPS 
ATLIAWELLRTQGLAGLYRGLGATLLRDIPFSIIYFPLFANLNN
LGFNELAGKASFAHSFVSGCVAGSIAAVAVTPLDVLKTRIQTLK 
KGLGEDMYSGITDCAR 


6666 


498 


2868 


MTTFLPVPQMMAGFSFGTFGNPPMESPSAWQTIHQPFIVSCLTL 
WS PG CW PQPIQ KEGVGLWD I R KPQS S LLRYGGNLS LiQSAMS VRF 
NSNGTQLLALRRRLPPVLYDIHSRLPVFQFDNQVYFNSCTMKSC 
CFAGDRDQYILSGSDDFNLYMWRIPADPEAGGIGRWNGAFMVL 
KGHRSIVNQVRFNPHTYMICSSGVEKIIKIWSPYKQPGCTGDLD
GRIEDDSRCLYTHEEYISLVLNSGSGLSHDYANQSVQEDPRMMA 
FFDSLVRREIEGWSSDSDSDLSESTILQLHAGVSERSGYTDSES 
SASLPRSPPPTVDESADNAFHLGPLRVTTTNTVASTPPTPTCED 
AASRQQRLSALRRYQDKRLLALSNESDSEENVCEVELDTDLFPR 
PRSPSPEDESSSSSSSSSSEDEEELNERRASTWQRNAMRRRQKT 
TREDKPSAPIKPTNTYIGEDNYDYPQIKVDDLSSSPTSSPERST 
STLEIQPSRASPTSDIESVERKIYKAYKWLRYSYISYSNNKDGE 
TSLVTGEADEGRAGTSHKDNPAPSSSKEACLNIAMAQRNQDLPP
EGCSKDTFKEETPRTPSNGPGHEHSSHAWAEVPEGTSQDTGNSG 
SVEHPFETKKLNGKALSSRAEEPPSPPVPKASGSTLNSGSGNCP 
RTQSDDSEERSLETICANHNNGRLHPRPPHPHNNGQNLGELEW 
AYSSPGHSDTDRDNSSLTGTLLHKDCCGSEMACETPNAGTREDP 
TDTPATDSSRAVHGHSGLKRQRIELEDTDSENSSSEKKLKT 


6667


171 


1310 


AEEVERLAAMRSDSLVPGTHTPPIRRRSKFANLGRIFKPWKWRK 
KKSEKFKHTSAALERKISMRQSREELIKRGVLKEIYDKDGELSI 
SNEEDSLENGQSLSSSQLSLPALSEMEPVPMPRDPCSYEVLQPS 
DIMDGPDPGAPVKLPCLPVKLSPPLPPKKVMICMPVGGPDLSIjV 
SYTAQKSGQQGVAQHHHTVLPSQIQHQLQYGSHGQHLPSTTGSL 
PMHPSGCRMIDELNKTLAMTMQRLESSEQRVPCSTSYHSSGLHS
GDGVTKAGPMGLPEIRQVPTWIECDDNKENVPHESDYEDSSCL 
YTREEEEEEEDEDDDSSLYTSSLAMKVCRKDSLAIKPSNRPSKR 
ELEEKNILPRQTDEERLELRQQIGTKL 


6668


714 


358 


TIaAVATGPALTLRCHVCTS s snckhs wcpas s r fckttntvep 
LRGNLVKKDCAESCTPSYTLQGQVSSGTSSTQCCQEDLCNEKLH 
NAAPTRTALAHSALSLGLALS LLAVI LAPSL 


6669


459 


1207 


KDEE TRKDYD YM LDHPE E Y YS H Y YH Y YS RRLAP KVD VR W I L VS 
VCAISVFQFFSWWNSYNKAISYLATVPKYRIQATEIAKQQGLLK 
KAKEKGKNKKSKEEIRDEEENIIKNIIKSKIDIKGGYQKPQICD 
LLLFQIILAPFHLCSYIVWYCRWIYNFNIKGKEYGEEERLYIIR 
KSMKMSKSQFDSLEDHQKETFLKRELWIKENYEVYKQEQEEELK 
KKLANDPRWKRYRRWMKNEG PGRLTFVDD 


6670 


184 


594 


VARI'GEAAKMSSEPPPPYPGGPTAPLLEEKSGAPPTPGRSSPA 
VMQPPPGMPLPPADIGPPPYEPPGHPMPQPGFIPPHMSADGTYM
PPGFYPPPGPHPPMGYYPPGPYTPGPYPGPGGHTATVLVPSGAA 
TTVTV 


6671


1 


763 


LPAEKPRSAPNMAGGRCGPQLTALLAAWIAAVAATAGPEEAALP
PEQSRVQPMTASNWTLVMEGEWMLKFYAPWCPSCQQTDSEWEAF 
AKNGEILQISVGKVDVIQEPGLSGRFFVTTLPAFFHAKDGIFRR 
YRGPGIFEDLQNYILEKKWQSVEPLTGWKSPASLTMSGMAGLFS
ISGKIWHLHNYFTVTLGIPAWCSYVFFVIATLVFGLSMDLVL*V 
I SQCNWDP P YRHVS * / RPSTNLGVHTAHTSEHLRL 


6672 


304 


1069 


APGSKPVQFMDFEGKTSFGMSVFNLSNAIMGSGILGLAYAMAHT 
GVIFFLALLLCIALLSSYSIHLLLTCAGLAGIRAYEQLGQRAFG
PAGKVVVATVICLHNVGAMSSYLFIIKSELPLVIGTFLYMDPEG 
DWFLKGNLLIIIVSVLIILPLALMKHLGYLGYTSGLSLTCMLFF 
LVSVIYKKFQLGLCYRATMKQQWESEALVGTPQPRDSTAAVKAQ 
rivtiii Li i\jVJj iy WPXMAFAFVCHPGGAGPSITELCRAFQAQD 


6673 


1116 


1963 


LQIQTHHTHHGARVTHLGSHQLLANAGTMLCRQQSSSMAPAFSQ 
SVTCGPSPCVRKQESATKCLHIGACGSDLWARGWEQG+G*GLNV 
WLCPCVAFHRGARPQAEEGGARWNSLVSSPWIPPNP*HSSIGAE
NAVPRP*QG*KVNPSGQERQS\WVLPLPVPGEPLKLPGLPG*NK 
SFSRV/SGSKGKWILPRQLM*AS*R\TPRFVPGTQWVPITW/PL 
ITWH*SAPTPPLKACPAPRPSDPCSSCLSCPCVTQHPRFSDTGW 
FG AGHCH S S CD F TR KGAAGGPG 


6674 


1 


440 


LE FD YMCQYD YVE VRbGDNRDGQ 1 1 KRVCGNERPAP I QS I GSS L 
HVLFHSDGSKNFDGFHAIYEEITACSSSPCFHDGTCVLDKAGSY 
KCACLAGYTGQRCENLLEERNCSDPG/WPSQWVPENNRGPWAYQ 
PTPC* IGTRVAFFLT 


6675 


277 


1678 


GNWPTERMAFLDNPTIIIiAHIRQSHVTSDDTGMCEMVLIDHDVD 
LEKIHPPSMPGDSGSEIQGSNGETQGYVYAQSVDITSSWDFGIR 
RRSNTAQRLERLRKERQNQIKCKNIQWKERNSKQSAQELKSLFE
KKSLKEKPPISGKQSILSVRLEQCPLQLNNPFNEYSKFDGKGHV 
GTTATKKIDVYLPLHSSQDRLLPMTVVTMASARVQDLIGLICWQ 
YTSEGREPKLNDNVSAYCLHIAEDDGEVDTDFPPLDSNEPIHKF 
GFSTLALVEKYSSPGLTSKESLFVRINAAHGFSLIQVDNTKVTM 
KEILLKAVKRRKGSQKVSGSRADGVFEEDSQIDIATVQDMLSSH 
HYKSFKVSMIHRLRFTTDVQL/GCALFPGVLRKRAAPVDCLRPS 
ADTWRQEQIGCCGAACAALRS*DSHKC*EGISGDKVEIDPVTNQ 
KASTKFWIKQKPISIDSDLLCAC\DLAEE 


6676 


277 


1678 


GNWPTERMAFLDNPTIILAHIRQSHVTSDDTGMCEMVLIDHDVD 
LEKIHPPSMPGDSGSEIQGSNGETQGYVYAQSVDITSSWDFGIR 
RRSNTAQRLERLRKERQNQIKCKNIQWKERNSKQSAQELKSLFE
KKSLKEKPPISGKQSILSVRLEQCPLQLNNPFNEYSKFDGKGHV 
GTTATKKIDVYLPLHSSQDRLLPMTVVTMASARVQDLIGLICWQ 
YTSEGREPKLNDNVSAYCLHIAEDDGEVDTDFPPLDSNEPIHKF 
GFSTLAtiVEKYSSPGLTSKESLFVRINAAHGFSHQVDNTKVTM 
KEILLKAVKRRKGSQKVSGSRADGVFEEDSQIDIATVQDMLSSH 
HYKSFKVSMIHRLRFTTDVQL/GCALFPGVLRKRAAPVDCLRPS 
ADTWRQEQIGCCGAACAALRS*DSHKC+EGISGDKVEIDPVTNQ 
KASTKFWIKQKPISIDSDLLCAC\DLAEE 


6677 


277 


1678 


GNW PT E RMAFLDNP T 1 1 LAH I RQ S HVTS DDTGM CE M VL I DHD VD 
LEKIHPPSMPGDSGSEIQGSNGETQGYVYAQSVDITSSWDFGIR 
RRSNTAQRLERLRKERQNQI KCKN I QWKERNSKQSAQET.KSI..FE 
KKSLKEKPPISGKQSILSVRLEQCPLQLNNPFNEYSKFDGKGHV 
GTTATKKIDVYLPLHSSQDRLLPMTWTMASARVQDLIGLICWQ 
YTSEGREPKLNDNVSAYCLHIAEDDGEVDTDFPPLDSNEPIHKF 
GFSTLALVEKYSSPGLTSKESLFVRINAAHGFSLIQVDNTKVTM 
KEILLKAVKRRKGSQKVSGSRADGVFEEDSQIDIATVQDMLSSH
HYKSFKVSMIHRLRFTTDVQL/GCALFPGVLRKRAAPVDCLRPS
ADTWRQEQIGCCGAACAALRS*DSHKC*EGISGDKVEIDPVTNQ
KASTKFWIKQKPISIDSDLLCAC\DLAEE 


6678 


221 


865 


GPSNQSSGSLSLIVTGCSSYWS*INDTCTILRVLSSNFGRQ*LR
PFPCSQLPMSQGCLWHLDCCCPWVPYIPGQQWRKGRQRMRN*QS 
LLGSDQ3SVGLEDLCVFVNFLLHVLLGLFP*PHELFLLPWDLG 
FLFPLLLQGGCHCLVLPANLVSQAPQIGKLSCRLQTHDLEGSRN 
HHPLFLWGRWDAVKHLETVQSGLASLGFVGQHTSHGPP 


6679 


2 


786 


LEFARGAMPFLGQDWRSPGQNWVKTVDGWKRFLDEKSGSFVSDL 
SSYCNKEVYNKENLFNSLNYD/SCSQEEKEGHAE*QNQNS\DFH
QEKW I YVHKGSTKERHGYCTLGEAFNRLDFSTAI LDSRRFN YW 
RLLELIAKSQLTSLSGIAQKNFMNILEKWLKVLEDQQNITLIR 
ELLQTLYTSLCTLVKRVGKSVLVGNINMWVYRMETILHWQQQLN 
N I Q I TRVS GQAQ P P PG S GS LHRDTGQTRQDFE FTP VTEE SGL F 


6680 


1498 


2951 


PLCTLPLMPSALPGWAGERWEKQWPLA/PGPGTWQTPVGSISEE
P\RKNEPDTHCPRGEARPEV*HLPKPHSPGSEGAEIQTSA*ALP 
/NQVSPPQPM*GAEENGDQRGGKEEAGEELHRSSSGLTAAPGFP 
EVHRNLQTFPGLPSRGGGP/GGAGTQGSWAPGEQPP/SPLLPAS 
MQRSQAGLPGWEAGLVESPTHHIPALRPSGTNATGEAFPSTTCS 
SGP\PAPPGPTGLRPGGGSSSGGHG**PGLPVGKV\GALGAAQD
PQSQGRGPTQGTVGTEMLLSGLGSAKACPAARPAVP*LPSDPAS 
TIP KKGTRGFGEGPG VLQERNRWWGRAQGFTSADAAGTAP PGV 
*LPAPLSQPPGATEPQVRACGMAPPSPGTSGRLVAWGRHPGPQV
AQGCPPGAGCWGSQPRGSQRCPRTYTHSPLGHGRAPCPRRCWH+ 
WQDPPSSPRTGCLPGIPARQAYSAPRTRSRPGIRTGRAAYGFIR 
FQGGGGG 


6681 


1169 


511 


INYIYYNQQQRAFHELK\EKLMSAPALGLPDLTKLFTLHVSERE 
KMTVGVLTQTVGPWSRPGAYLSKQLDGVSKGWPPCPRALAATAL
LAQEADE L TL RQNLNRKS PHA \ WTL I NTKGHH * LI NARLTR YQ 
TLLCENPHKTIEVSNT/ LNPATLLLVTESPVKHKrCLBVT.n^wc; 

SRPNLRDHP*TSVDWELYVDGSGFANPCKVTLKKETSPAPVTPR
S 


6682 


109 


1238 


TVLCGAMQVSSLNEVKIYSLSCGKSLPEWIiSDRKKRALQKKDVD™' 
VRRRIELIQDFEMPTVCTTIKVSKDGQYILATGTYKPRVRCYDT
YQLSLKFERCLDSEWTFEILSDDYSKIVFLHNDRYIEFHSQSG 
FYYKTRIPKFGRDFSYHYPSCDLYFVGASSEVYRLNLEQGRYLN 
PLQTDAAENNVCDINSVHGLFATGTIEGRVECWDPRTRNRVGLL 
D\AP*TVSQQIQR*TSLPTISALKFN\GALTMAVGTTTGQVLLY
DLRSDKPLLVKDHOYGIjPTK<?VWPnnciT.nT.TT.QanCD tvvmwmv 
NSGKIFTSLEPEHDLNDVCLYPNSGMLLTANETPKMGIYYIPVL
GPAPRWCSFLDNLTEELEENPESNE 


6683 


109 


1238 


TVLCGAMQVSSLNEVKIYSLSCGKSLPEWLSDRKKRALQKKDVD 
VRRR I EL I QD FEM P TVCTT I K VS KDGOY I LATGT YK P T? VP P vnT 

YQLSLKFERCLDSEWTFEILSDDYSKIVFLHNDRYIEFHSQSG 
FYYKTRIPKFGRDFSYHYPSCDLYFVGASSEVYRLNLEQGRYLN
PLQTDAAENNVCDINSVHGLFATGTIEGRVECWDPRTRNRVGLL 
D \ AP + T VSQQ I QR* TS LP T I SALKFN \GALTMAVGTTTGQVLLY 
DLRSDKPLLVKDHQYGLPIKSVHFQDSLDLILSADSRIVKMWNK 
NSGKIFTSLEPEHDLNDVCLYPNSGMLLTANETPKMGIYYIPVL 
GPAPRWCSFLDNLTEELEENPESNE 


6684 


111 


527 


GLRGGTSRGRAGREPEFAAGVLCWAGFCQSPCPPGGRGREAPA 
P P \ S G RRHA* RPA* WLGG PGGDSGGR E EGGS / G ELQRAM E S KMG 

RNIVQNYR 


6685 


258 


1473 


KLLGDNFEGFCNKFELSDSENGSNS*QSPL\FDRLFDPDPQKVL 
CX3VIDMKNAVIGNNKQKANLIVLGAVPRLLYLLQQETSSTELKT 
ECAWLGSLAMGTENNVKSLLDCHI I PALLQGLLSPDLKFIEAC 
LRCLRTIFTSPVTPEELLYTDATVIPHLMAT,T.QRQPVTnFVTr , n 

IFSHCCKGPDHQTILFNHGAVQNIAHLLTSLSYKVRMQALKCFS 
VLAFENPQVSMTLVNVLVDGELLPQIFVKMLQRDKPIEMQLTSA 
KCLTYMCRAGAIRTDDNCIVLKTLPCLVRMCSKERLLEERVEGA
ETLAYLIEPDVELQRIASITDHLIAMLADYFKYPSSVSAITDIK
RLDHDLKHAHELRQAAFKLYASLGANDEDIRKKVSLGEGRPPVL
TASRQGVTST 


6686 


310 


927 


DSVTFDDLAVDFTPKEWTLLDPTQRNLYRDVMLENYKNLATVGY 
QLFKPSLISWLEQEESRTVQRGDFQASEWKVQLKTKELALQQDV 
LGEPTSSGIQMIGSHNGGEVSDVKQCGDVSSEHSCLKTHVRTQN 
SENTFECYLYGVDFLTLHKKTSTGEQRSVFSHVWKKPSSLNPDV 
VCQKNRCTRKKKAF * LQLTLGKSFH * S IHT 


6687 


181 


915 


EAMLEAPYKKEEDEQQRKEVKKDYPSNTTSSTSNSGNETSGSST

IGETSNRSRDRDRYRRRNSRSRSPGRQCRHRSRSWDRRHGSESR 

SRDHRREDRVHYRSPPLATGEPVDNLSPEERDARTVFCMQLAAR 

IRPRDLEDFFSAVGKVRDVRIISDRNSRRSKGIAYVEFCEIQSV 

PLAIGLTGQRLLGVPIIVQASQAEKNRLAAMANNLQKGNGGPMR

LYVGSLHFNITEDMLRGIFEPFGKV 


6688 


1025 


1 


AEVPNYPRVFHKCPDSCWKFKFQPIQLQPVlLLSFSSEKPPISF 
SEPGLPR/SATARMATAAAPPNSSIDLPSDSGMGFISPAGDSLD 
LPSDGGTGFFSLAGDSSSTRLSSLAFISFSLSSVSVGSSAGTTS 
STSVGSWAAFTSSSSSSTNRDVAGLDFSTVITSVSGSLVPSRE 
VAVICGSKGAGASGSASCSSRAGKTTEATAASSMPSGTSSFSTC 
TMSELEELFSLFSPAPLLSKLFTSSGSIAICCQDSGPSDTGRLS 
VCQLWLADSDTGKLSDCQEWTVGDSGGLTCPELSLGRM*MSLL 
SSAVIPGYSSSSDSRLNTVPTVDLLCPFQTKSST 


6689 


640 


1299 


SSSASYATSATSISDTAFSGSLKLKHGLLSALDSSSRTS*STSS 
AEDSTFRICSPSVSDTSSDSSGSKDNVLILFSKVSI*SCFSLSS 
FFSDSISFCFSSSSFCKR*FVSSKVSQNALLSSRLSNGPGGSSK 
QRNSLTARQLAMSL*ATKF*RNACNPNCLSSKKSAL*LSLNQRF
GGSASRKPGNISFNSQKCSALSYCCNFVIKPREVSVSSENYPAF 


6690 


1 


442 


GTRGKMAATLGPLGSWQQWRRCLSARDGSRMLLLLLLLGSGQGP 
QQVGAGQTFEYLKREHSLSKPYOGVGTGSSSLWMLMnMaMVTMTn 

YIRLTPDMQSKQGALVJNRVPCFLRDWELQVHFKIHGQGKKNL\H 
GDGLAIWYTKDRMQP 


6691 


287 


1401 


LKTETSEEKARRYKDRPSQLNAVFQEQKKMIQAQESITLEDVAV 
DFTWEEWQLLGAAQKDLYRDVMLENYSNLVAVGYQASKPDALFK 
LEQGEQLWTIEDGIHSGACSDIWKVDHVLERLQSESLVNRRKPC 

GYEIKNSVEFTGNGDSFLHANHERLHTAIKFPASQKLISTKSQF
ISPKHQKTRKLEKHHVCSECGKAFIKKSWLTDHQVMHTGEKPHR
CSLCEKAFSRKFMLTEHQRTHTGEKPYECPECGKAFLKKSRLNI 
HQKTHTGEKPYICSECGKGFIQKGNLIVHQRIHTGEKPYICNEC 
/GKGFIQKTCLIAHQRFHTER 


6692 


178 


939 


n j.i\x.vjoijojji«r»Kr L_tti\x XKAUrMr KJilAr I MDCjNKRYAKKCQVE 
RQEGHSO^FNKLAETLRWCLNLGILEVTVYAFSIENFKRSKSEV 
DGLMDLARQKFSRLMEEKEKLQKHGVCIRVLGDLHLLPLDLQEL 
IAQAVQATKNYNKCFLNVCFAYTSRHEISNAVREMAWGVEQGLL 

OILiJJjIjIVJVUIjX Jl JNKOf fli'UlijX'Kl b«iiVKijoUFlgIjWQTSH 

SCLVFQPVLWPEYTFWNLFEAILQFQMNHSVLQK 


6693 


178 


939 


WIKEGELSLWERFCANIIKAGPMPKHIAFIMDGNRRYAKKCQVE 
RQEGHSQGFNKLAETLRWCLNLGILEVTVYAFSIENFKRSKSEV
DGLMDLAROKFSRLMEEKEKTjOKHnvr'TRVT.nnT.vrr t dt ni ncr 

IAQAVQATKNYNKCFLNVCFAYTSRHEISNAVREMAWGVEQGLL 
DPSDISESLLDKCLYTNRSPHPDILIRTSGEVRLSDFLLWQTSH 
SCLVFQPVLWPEYTFWNLFEAILQFQMNHSVLQK 


6694 


292 


813 


SLLLHLAPPGAYTPSQPLSSVSTETASSVRRQAAESRQHELPVR 
EVHSLGQILPQDGLTAEAGPPEAQDPWGSPGISLPAAHIGFAAA 
LAVG P S GCHTE P \ FDE VW P S L FLCTD A Y A AR n K <% K"T . T m rz r Tvnn r 

NAAAGKFQ VDTGAKFYRGMS LE Y YG I EADDNP FFDLS VY FL P 


6695


292 


813 


SLLLHliAPPGAYTPSQPLSSVSTETASSVRRQAAESRQHELPVR 
EVHSLGQILPQDGLTAEAGPPEAQDPWGSPGISLPAAHIGFAAA 
LAVG PSGCHTEP \ FDEVWPS LFLGDAYAARDKS KL I QLG ITHW 
NAAAGKFQVDTGAKFYRGMSLE YYG I EADDNP FFDLS VYFLP 


6696 


1 


782


PRVRGRVGERWAFLSVPAAM < 3 c ?FMF , PT.T.T.&WQVFPPDT<'WriT r-rirT 

LCTQMLEKSPYDQAAWILKARALTEMVYIDEIDVDQEGIAEMML 
DENAIAQVPRPGTSLKLPGTNQTGGPSQAVRPITQAGRPITGFL 
RPSTQSGRPGTMEQAIRTPRTAYTARP1TSSSGRFVRLGTASML 
TSPDGPFINLSRLNLTKYSQKPKLAKALIEYIFHHENDVKTALD 
LAALSTEHSQYKD WWWK/ DQ I EKCYYRVGMYREAEKQIKSS 


6697


3 


782 


P PLFLRRLNS RAL R PGSR K VMA WPAS LSGQDVGS FAYLTI KDR 
IPQILTKVIDTLHRHKSEFFEKHGEEGVEAEKKAISLLSKLRNE 
LQTDKPFIPLVEKFVDTDIWNQYLEYQQSLLNESDGKSRWFYSP
WLLV\ECYMYRRIHEAI\IQSPPIDYFDVFKESKEQNFYGSQES 
IIALCTHLQQLIRTIEDLD\ENQLKDEFFKLLQISLWGEISVDL 
SL\SGGESSSQNTNVLNSLEDLKPFILLNDMEHLWSLLSNCK 


6698 




754 


VGSCACAGSCKCKECKCTSCKKSECRAFP 


6699 


325 


492 


EGELP / PARRVLPRAMTASAQPRGRRPGVGVGVWTSCKHPRCV 
LLGKRKGSVGAGSFQLPGGHLEFGETWEECAQRETWEEAALHLK 
NVHFASVVNSFIEKENYHYVTILMKGEVDVTHDSEPKNVEPEKN 



541 



WO 01/53312 



PCT/US00/34263 



ESKRI I YNHAFFFQESKWSGGILQ 


6700 


1098 


1392 


TQCWRSSTPGMRTHFRTQP/RLECGQGFSQQENGHCMDTNECIQ
FPFVCPRDKPVCVNTYGSYRCRTNKKCSRGYEPNEDGTACVERT 
LLLGLCNLLGK 


6701 


2 


1485 


AAAGPRTRVRRAAAFEGQPSPSPGLGPTSDKAAAPRTPKRRRLW 
RQRQ/HPAMLCYVTRPDAVLMEVEVEAKANGEDCLNQVCRRLGI 
IEVDYFGLQFTGSKGESLWLNLRNRISQQMDGLAPYRLKLRVKF 
FVEPHLILQEQTRHIFFLHIKEALLAGHLLCSPEQAVELSALLA 
QTKFGDYNQNTAKYNYEELCAKELSSATLNSIVAKHKELEGTSQ 
ASAEYQVLQIVSAMENYGIEWHSVRDSEGQKLLIGVGPEGISIC
KDDFS P I NR I AY P WQMATQSGKNV YLTVTKESGNS I VLLFKMI 
STRAASGLYRAITETHAFYRCDTVTSAVMMQYSRDIiKGHLASLF 
LNEN I NLG KKYVFD IKRTS KE VYDHARRALYNAG WD LVS RNNQ 
SPSHSPLKSSESSMNCSSCEGLSCQQTRVLQEKLRKIjKEAMLCM 
VCCEEEINSTFCPCGHTVCCESCAAQLQVGESAAHFCLQPHLSL 
LLTGSRSQVLAR 


6702 


397 


1971 


RAEALLCSRKATWRDLVAVRMAEEQEFTQLCKLPAQPSHPHCV 
NNTYRSAQHSQALLRGLLALRDSGILFDWLWEGRHIEAHRIL 
LAASCDYFKGMFAGGLKEMEQEEVLIHGVSYNAMCQILHFIYTS 
ELELSLSNVQETLVAACQLQIPEIIHFCCDFLMSWVDEENILDV 
YRLAELFDLSRLTEQLDTYILKNFVAFSRTDKYRQLPLEKVYSL 
LSSNRLEVSCETEVYEGALLYHYSLEQVQADQISLHEPPKLLET 
VRFPLMEAEVLQRLHDKLDPSPLRDTVASALMYHRNESLQPSLQ 
SPQTELRSDFQCWGFGGIHSTPS\MSSATRPKYLNPLLGEWKH 
FTASLAPRMSNQGIAVLNNFVYLIGGDNNVQGFRAESRCWRYDP
RHNRWFQIQSLQOEHADLSVCWGRYIYAVAGRDYHMnT,NAVPP 

YDPATNSWAYVAPLKREVYAHAGATLEGKMYITCGRKGRIT 


6703 


45 


1244 


GVGPRAAAMPLELEIiCPGRWVGGQHPCFIIAEIGQNHQGDIiDVA 
KRMIRMAKECGADCAKFQKSELEFKFNRKALERPYTSKHSWGKT 
YGEHKRHLEFSHDQYRELQRYAEEVGIFFTASGMDEMAVEFLHE
LNVPFFKVGSGDTNNFPYLEKTAK/TRGWHSVLRDVCGVQLNDE 
TSSWDVLGRVRTSKEKVLMVLVLDYSGRPMVISSGMQSMDTMKQ 
VYQIVKPLNPNFCFLQCTSAYPLQPEDVNLRVISEYQKLFPDIP
IGYSGHETGIAISVAAVALGAKVLERHITLDKTWKGSDHSASLE
PGELAELVRSVRLVERALGSPTKQLLPCEMACNEKLGKSVVAKV 
KIPEGTILTMDMLTVKVGEPKGYPPEDIFNLVGKKVLVTVEEDD 
TIMEE 


6704


82 


1007 


TMNTRNRWNSGLGAS PASRPTRDPQDPSGRQGELS PVEDQREG 
LEAAPKGPSRESWHAGQRRTSAYTLIAPNINRRNEIQRIAEQE 

YKQKLKREESVRIKKEAEEAELQKMKAIQREKSNKLEEKKRLQE 
NLRREAFREHQQYKTAEFL/RQTEHRIARQKCLSKCCLWPTILN
MGQKLGLQ\DSLKAEENRKLQKMKDEQHQKSELLELKRQQQEQE 
RAKIHQTEHRRVNNAFLDRLQGKSQPGGLEQSGGCWNMNSGNSW
GI 


6705 


2 


786 


RLCRNSARVPCGWSASRSLGEGAGFIGPLRGPHPRAGGTGTSFT 
SYKRKGGIMSTIAAFYGGKSILITVATGFLGKELMEKLFRTSPD 
LKV I YI LVRPKAGQTLQHRVFQ I LDS KLFE KV I E VR PNVHEK1 R 
AIYADLNQNDFAISKEDMQELLSCTNIIFHCAATVRFDDTLRHA
VQLNVTATRQLLLMASQMPKLEAFIHISTAYSNCNLKHIDEVIY 
PCPVEPKKIIDSLEW\LDDAIIDEITPKLIRDWPNIYTYTK 


6706 


130


531 


FTHSSSSHSQEMIX3KLNMLRNDGHFCDITIRVQDKIFRAHKWL 
AACSDFFRTKLVGQAEDENKNVLDLHHVTVTGFIPLLEYAYTAT
LSINTENIIDVLAAASYMQMFSVASTCSEFMKSSILWNTPNSQP 
EK 


6707 


2233 


1343 


YWSGIGYELQHFHWRKFHFEKKGPPSTCQERLYESRSRWPCIS* 
GMVWGWTAVNGSW*GGQLRCVCVCTSHSSDSTRSSQRASKCHS 
FFILSQ*KT+SSWENWVFAKYSRIYSYGHSCSKGRGD*DFK*NV 
SQAR*SRFCGLCNPCGHCGLDINLRGGSSPWTDKHSCVHNNLLC 
NRRVFSLLCEGPGHCYQGAVCREACAAASPGLDSAAEPHRLCEH 
TD*LPK*GPGYIQHFHCDSNILCILYNISFNLFSYSF*GVARYA 
C*RCHWYFEWLLYNHCGDILVACL*RRQL*SSQ 


6708 


115


1729 


TVGSWSRSGRSPPVGRQLIiLTGRGAQAAGSPQGGMALQVELVPT 
G E 1 1 R WH PHR PCKLALGS DG VR VTMESALTARDR VG VQDFVLL 
ENFTSEAAFIENLRRRFRENLIYTYIGPVLVSVNPYRDLQIYSR 
QHMERYRGVSFYEEPPHLLAVADTVYRALRTERRDQAVMISVES 
GAGKTDATKRLLQLYAETCPAPQRGGAVRDRLLQSNPVLEAFGN 
AKTLRNDNSSRFGKYMDVQFDFKGAPVGGHILSYLLEKSRWHQ 
NHGERNFHIFYQLLEGCEEETLRRLGLERNPQSYLYLVKGQCAK 
VSSINDKSDWKWRKALTVIDFTEDEVEDLLSIAASVLHLGNIH 
FAANEESNAQVTTENQLKYLTRLLSVEGSTLREALTHRKIIAKG 
EELLSPLNLEQAAYARDALAKAVYSRTFTWLVGKINRSLASKDV
ESPSWRSTTVLGLLDIYGFEVFQHNSFEQFCINYCNEKLQQLFI
ELTLKSEQEEY EAEG I AWE P VQ Y FNNKI I CDL VEE KF KG 1 1 \S I 
LDE\ECLRPGE 


6709 


3 


894 


PPHEHLFPSGERGPFSFLVSRRGLGPGKMGKKGKKEKKGRGAEK 
TAAKMEKKVSKRSRKEEEDLEALIAHFQTLDAKRTQTVELPCPP 
PSPRLNASLSVHPEKDELILFGGEYFNGQKTFLYNELYVYNIRK 
DTWTKVDIPSPPPRRCAHQAWVPQGGGQLWVFGGEFASPNGEQ 
FYHYKDLWVLHLATKTWEQVKSTGGPSGRSGHRMVAWKRQLILF 
GGFHESTRDYIYYNDVYAFNLDTFTWSKLSPSGTGPTPRSGCQ\
I PSLPRAASSVYGGYSKQRVKKDVDKGTRHSDMF 


6710 


158 


980 


RHKMTNYRVESSSGRAARKMRLALMGPAFIAAIGYIDPGNFATN 
I QAGAS FG YQLLWVWWANLMAML I Q I LS AKLG I ATGKNLAEQI 
RDH YPRPWW FYWVQAE 1 1 AMATDLAE F IGAAI GFKLI LGVSLL 
QGAVLTGIATFLILMLQRRGQKPLEKVIGGLLLFVAAAYIVELI 
FSQPNLAQriGKGMVIPSLPTSEAVFIiAAGVLNGATIMPHVl/YI 
WHSSLTQHLHGGSRQQRYSATKWDVAIAMTIAGFVNLAIMATAA 
SELNFYGHTGVA 


6711 


3 


347 


VTECKTMTCKMSQLERNI*TMINTLHHYSVKLGHPDTLIHGEFK 
E L VRTD LHN I LM KENKNDQAI * H I ME DLDTNAHMQ 1 1 FKE L I ML 
MAMLTWSYHDNM1IDADYGPGQQHRPG 


6712 


lie 


578 


PHGQKRTRYPQVRAPGQQPQAQLAMALCLKQVFAKDKTFRPRKR 
F E PGTQRFE L YKKAQAS L KSGLDLR S WRLP PGEN I DDW I AVHV 
VD FFNR I NL I YGTMAE RC S * T S CP VMAGG PR YE YRWQDBRQ YRR 
PAKLSAPRYMALLMDWIESLI 


6713


2485 


3 


QARGSDSEDGEFEIQAEDDARARKLGPGRPLPTFPTSECTSDVE 
PDTREMVRAQNKKKKKSGGFQSMGLSYPVFKGIMKKGYKVPTPI
QRKTIPVILDGKDWAMARTGSGKTACFLLPMFERLKTHSAQTG 
ARALILSPTRELALQTLKFTKELGKFTGLKTALILGGDRMEDQF 
AALHENPDIIIATPGRLVHVAVEMSLKLQSVEYVVFDEADRLFE
MG FAEQLQE 1 I ARLPGGHQTVL FSATLP KLLVE FARAGLTEPVL 
IRLDVDTKLNEQLKTSFFLVREI^KAAVIiLHLLHNVVRPQDQTV 
VFVATKHHAEYLTELLTTQRVSCAHIYSALDPTARKINLAKFTL 
GKCSTLIVTDLAARGLDIPLLDNVINYSFPAKGKLFLHRVGRVA
RAGRSGTAYSLVAPDEIPYLLDLHLFLGRSLTLARPLKEPSGVA 
GVDGMLGRVPQSWDEEDSGLQSTLEASLELRGLARVADNAQQQ 
YVRSRPAPSPESIKRAKEMDLVGLGLHPLFSSRFEEEELQRLRL
VDSIKNYRSRATIFEINASSRDLCSQVMRAKRQKDRKAIARFQQ
GQQGRQEQQEGPVGPAPSRPALQEKQPEKSEEEEAGESVEDIFS 
EWGRKRQRSGPNRGAKRRREEARQRDQEFYIPYRPKDFDSERG 
LS ISGEGGAFEQQ7VAGAVLDLMGDEAQNLTRGRQQLKWDRKKKR 
FVGQSGQEDKKKIKTESGRYISSSYKRDLYQKWKQKQKID*S*L 
GRRRGILTRRRPRTEEVGEARPLAQAGCIPGPHAPRHPLQAESA 
LELKTKQQILKQRRRAQKAALSLQRWWPQAALCPQ 


6714 


169 




NNCQELLPPPPAPMAHIPSGGAPAAGAAPMGPQYCVCKVELSVS 
GQNLLDRDVTSKSDPFCVLFTENNGRWIEYDRTETAINNLNPAF 
SKKFVLDYHFEEVQKLKFALFDQDKSSMRLDEHDFLGQFSCSLG
TIVSSKKITRPLLLLNDKPAGKGLITIAAQELSDNRVITLSLAG 
RRLDKKDLFGKSDPFLEFYKPGDDGKWMLVHRTEVIKYTLDPVW
KPFTVPLVSLCDGDMBKPIQVMCYDYDNDGGHDFIGEFQTSVSQ 
MCEARDSVPLEFECINPKKQRKKKNYKNSGIIILRSCKINRDYS 
FLDYILGGCQLMFTVGIDFT7VSNGNPLDPSSLHYINPMGTNEYL 
SAIWAVGQIIQDYDSDKMFPALGFGAQLPPDWKVSHEFAINFNP 
TNPFCSGVDGIAQAYSACLP 


6715 


32 


493 


GPAGAESGSLHCLPATVQALiAGAAHSPHGGQPPRRGPLIGSGMP 
GKPKHLGVPNGRMVLAVSDGELSSTTGPQGQGEGRGSSLSIHSL 
PSGPSSPFPTEEQPVASWALSFERLLQDPLGLAYFTEFLKKEFS 
AENVTFWKACERFQQIPASDT


6716 


1 


17* 


GAGGPAPRSFGSEEPRAALERDKMSARAAAAKSTAMEETAIWEQ
HTVTLHRVSLCCSK 


6717 


115 


896


LFAMSGFENLNTDFYQTSYSIDDQSQQSYDYGGSGGPYSKQYAG 
YDYSQQGRFVPPDMMQPQQPYTGQIYQPTQAYTPASPQPFYGNN 
FEDEPPLLEELGINFDHIWQKTLTVLHPLKVADGSIMNETDLAG 
PMVFCLAFGATLLLAGKIQFGYVYGISAIGCLGMFCLLNLMSMT 
GVSFGCVASVLGYCLLPMILLSSFAVIFSLQGMVGIILTAGIIG
WCSFSASKIFISALAMEGQQIiLVAYPCALLYGVFALISVF


6718 


290 


599 


KQSSTVPGTILPSLKWHNSGLCKFPETGGKMTTFKEGLTFKDVA 
VIFTEEELGLLDPVQRNLYQDVMLENFRNLLSVGHHPFKHDVFL
LEKEKKLDIMKTATQ


6719


1 


691 


PTRPEEQDREDGKCHKMEMNPISGNLNCDPIAMSQCSSDHGCET 
DLDSDDDKIEKPNNFMKDSASQDNGLSRKISRKRVCSSDSVSSL
QWKKSSKARTGLLRITRRCAATAANKIKLMSDVEDVSLENVHT
RSKNGRKKPLHLACTTAKKKLSDCEGSVHCEVPSEQYACEGKPP 
DPDSEGSTKVLSQALNGDSDSEDMLNSEHKHRHTNIHKIDAPSK 
RKSSSVTSSG 


6720 


3 


822 


HEVAEEAGGTVYPQRGTMPGTKRFQHVIETPEPGKWELTGYEAA 
VPITEKSNPLTQDLDKADAENIVRLLGQCDAEIFQEEGQALSTY 
QRLYSESILTTMVQVAGKVQEVLKEPDGGLWLSGGGTSGRMAF
LMSVSFNQLMKGLGQKPLYTYLIAGGDRSWASREGTEDSALHG
IEELKKVAAGKKRVIVIGISVGLSAPFVAGQMDCCMNNTAVFLP
VLVGFNPVSMARHPFPPPRILRSLTVFPSLRAPHYQITSLLFSM
SWTLISE 


6721 


3 


822 


HEVAEEAGGTVYPQRGTMPGTKRFQHVIETPEPGKWELTGYEAA
VPITEKSNPLTQDLDKADAENIVRLLGQCDAEIFQEEGQALSTY
QRLYSESILTTMVQVAGKVQEVLKEPDGGLWLSGGGTSGRMAF
LMSVSFNQLMKGLGQKPLYTYLIAGGDRSWASREGTEDSALHG
IEELKKVAAGKKRVIVIGISVGLSAPFVAGQMDCCMNNTAVFLP
VLVGFNPVSMARHPFPPPRILRSLTVFPSLRAPHYQITSLLFSM
SWTLISE


6722 


1 


390 


RSWSKRTWQALPMAVLFLLLFLCGTPQAABNMQAIYVAtGEAVE 
LPCPSPSTLHGDEHLSWFCSPAAGSFTTLVAQVQVGRPAPDPGK 
PGRESRLRLLGNYSLWLEGSKEEDAGRYWCAVLGQHHNYQNW 


6723 


173 


659 


VCQYCTARMADFGISAGQFVAWWDKSSPVEALKGLVDKLQALT 
GNEGRVSVENIKQLLQSAHKESSFDIILSGLVPGSTTLHSAEIL
AEIARILRPGGCLFLKEPVETAVDNNSKVKTASKLCSALTLSGL
VEVKELQREPLTPEEVQSVREHLGHESDNL 


6724 


173 


659 


VCQYCTARMADFGISAGQFVAWWDKSSPVEALKGLVDKLQALT
GNEGRVSVENIKQLLQSAHKESSFDIILSGLVPGSTTLHSAEIL
AEIARILRPGGCLFLKEPVETAVDNNSKVKTASKLCSALTLSGL 
VEVKELQREPLTPEEVQSVREHLGHESDNL 


6725 


356 


722 


RRRTPPVILATMDDDLMLALRLQEEWNLQEAERDHAQESLSLVD 
ASWELVDPTPDLQALFVQFNDQFFWGQLEAVEVKWSVRMTLCAG 
ICSYEGKGGMCSIRLSEPLLKLRPRKDLVEVFFV


6726 


98 


714 


HLOKMERKINRREKEKEYEGKHNc?T.pryrnfv^irKrnK'C3rp 7 , wf rT 

GYLYITQKQTLTKYPDTFLEGIVNGKILCPFDADGHYFIDRDGL 
LFRHVLNFLRNGELLLPEGF'RENQLIiAQEAEFFQLKGLAEEVKS 
RWEKEQLTPRETTFLEITDNHDRSQGLRIFCNAPDFISKIKSRI 
VLVSKSRLDGFPERF'? T <? <?mt JnwwwTV 


6727 


1 


831 


FRGMGDERPHYYGKHGTPQKYDPTFKGPIYNRGCTDIICCVFLL 
LAIVGYVAVGIIAWTHGDPRKVIYPTDSRGEFCGQKGTKNENKP 
YLFYFNIVKCASPLVLLEFQCPTPQICVEKCPDRYLTYLNARSS 
RDFEYYKQFCVPGFKNNKGVAEVLRDGDCPAVLIPSKPLARRCF 
PAIHAYKGVLMVGNETTYEDGHGSRKNITDLVEGAKKANGVLEA
RQLAMRIFEDYTVSWYWDIISLGIAMAMSLLFIILLRFLAGIMG
RGMIIMGILVLGY


6728 


486 


935 


r\.oowijKoijAUboLoWlu v ]r bvbb JGGIASGKSSVIQVFQQLGCA 
VIDVDVMARHWQPGYPAHRRIVEVFGTEVLLENGDINRKVLGD
LIFNOPDRROLLNATTHP'FTPK'PlvlMV'CT'Pi^V'PT D?DDi<c!nDnvv 
HVPSALKEADSLMRRDT 


6729 


259 


1191 


VGLTGAQSGRTASMGRDQRAVAGPALRRWLLLGTVTVGFLAQSV 
LAGVKKFDVPCGGRDCSGGCQCYPEKGGRGQPGPVGPQGYNGPP 
GLOGF PGLOGRKGDKGERR APRVTfJP jcrcnvp a ppvcnwor* a nn t 

PGHPGQGGPRGRPGYDGCNGTQGDSGPQGPPGSEGFTGPPGPQG 
PKGQKGEPYALPKEERDRYRGEPGEPGLVGFQGPPGRPGHVGQM 
GPVGAPGRPGPPGPPGPKGQQGNRGLGFYGVKGEKGDVGQPGPN 
GIPSDTLHPIIAPTGVTFHPDQYKGEKGSEGEPGIRGISLKGEE 
GIM 


6730 


784 


1015 


NMVDYYEVLGLQRYASPEDIKKAYHKVALKWHPDKNPENKEEAE 
RKFKEVAEAYEVLSNDEKRDIYDKYGTEGLNEF 


6731


i ; 


446 


GIRKRLHGAWPRVEVGCPWETRESEGVHLERPTSPLKNNDEGS 
LDIYAGLDSAVSDSASKSCVPSRNrrT.nT.vPB'TT tcpp^rvuxtv 

NDLQVEYGKCQLQMKELMKKFKEIQTQNFSLINENQSLKKNISA 
LIKTARVE1NRKDEEI 


6732 


102 


1205 


GRWQRRPPPPSPPLWCLQPGGGSDPQQLTQLRHCLSHSPQDTPW
AQRQVCYTAATTQAAAPATRNCLPDHSGHRPTPPRSHRHHRQEN
LGSIKPSSRSTKATSTTMAGDGRRAEAVREGWGVYVTPRAPIRE
ui^uKJArti^v^wvjvjaolJAi'AxK Jl Pr'bKQGRKEVRr SDEPPEVYGDFE 
PLVAKERSPVGKRTRLEEFRSDSAKEEVRESAYYLRSRQRRQPR 
PQETEEMKTRRTTRLQQQHSEQPPLQPSPVMTRRGLRDSHSSEE 
DEASSQTDLSQTISKKTVRSIQEAPAVSEDLVIRLRRPPLRYPR 
YEATSVQQKVNFSEEGETEEDDQDSSHSSVTTVKARSRDSDESG
DKTTRSSSQYIESFW 


6733


613 


1311 


RSCRQVGMRSRNQGGESASDGHISCPKPSIIGNAGEKSLSEDAK 
KKKKSNRKEDDVMASGTVKRHLKTSGECERKTKKSLELSKEDLI 
QLLSIMEGELQAREDVIHMLKTEKTKPEVLEAHYGSAEPEKVLR 
VLHRDAILAQEKSIGEDVYEKPISELDRLEEKQKETYRRMLEQL
LliAEKCHRRTVYELEWEKHKHTDYMNKSDDFTNLLEQERERLKK 
LLEQEKAYQARKE 


6734 


189 


551 


SAAMFPVFSGCFQELQEKNKSLELVSFEEVAVHFTWEEWQDLDD 
AQRTLYRDVMLETYSSLVSLGHCITKPEMIFKLEQGAEPWIVEE 
TLNLRLSGGSKKQVFSGICHRSLVELQEVHLV 


6735 


280 


558 


KSRRAGVTKMSNPFIjKQVFNKDKTFRPKRKFEPGTQRFELHKKA 
QASLNAGLDLRLAVQLPPGEDLNDWAVHVVDFFNRVNLIYGTI
XDGCT 


6736 


195 


808 


MNYELNFKREMPNIKSLGLTNLNFLLKRLSSVLPLITDYVYFEN 
SSSNPYLIRRIEELNKTASGNVEAKWCFYRRRDISNTLIMLAD
KHAKEIEBESETTVEADLTDKQKHQLKHRELFLSRQYESLPATH
IRGKCSVALLNETESVLSYLDKEDTFFYSLVYDPSLKTLLADKG
EIRVGPRYQADIPEMLLEGTFFCVFAVL


6737 


150 


1209 


PVIMPLHFSPGDIVRPSCCVSSSPKLRRNAHSRLESYRPDTDLS
REDTGCNLQHISDRENIDDLNMEFNPSDHPRASTIFLSKSQTDV 
REKRKSLFINHHPPGQIARKYSSCSTIFLDDSTVSQPNLKYTIK 
CVALAIYYHIKNRDPDGRMLLDIFDENLHPLSKSEVPPDYDKHN 
PEQKQIYRFVRTLFSAAQLTAECAIVTLVYLERLLTYAEIDICP 
ANWKRIVLGAILLASKVWDDQAVWNVDYCQILKDITVEDMNELE
RQFLELLQFNINVPSSVYAKYYFDLRSLAEANNLSFPLEPLSRE 
RAHKLEAISRLCEDKYKDLRRSARKRSASADNLTLPRWSPAIIS 


6738 


148 


*53 


CACAEQPARAEVGAATALPVRWASGEMAPSGSLAVPLAVLVLLL 
WGAPWTHGRRSNVRVITDENWRELLEGDWMIEFYAPWCPACQNL 
QPEWESFAEWGEDLEVNIAKVDVTEQPGLSGRFIITALPTIYHC 
KDGEFRRYQGPRTKKDFINFISDKEWKSIBPVSSWF 


6739 


3 


631 


SWPDMAEEEVAKLEKHLMLLRQEYVKLQKKLAETEKRCALLAAQ 
ANKESSSESFISRLLAIVADLYEQEQYSDLKIKVGDRHISAHKF
VLAARSDSWSLANLSSTKELDLSDANPEVTMTMLRWIYTDELEF 
REDDVFLTELMKLANRFQLQLLRERCEKGVMSLVNVRNCIRFYQ 
TAEELNASTLMNYCAEIIASHWVSEVEGVNKAL 


6740 


3 


631 


SWPDMAEEEVAKLEKHLMLLRQEYVKLQKKLAETEKRCALLAAQ
ANKESSSESFISRLLAIVADLYEQEQYSDLKIKVGDRHISAHKF 
VLAARSDSWSLANLSSTKELDLSDANPEVTMTMLRWIYTDELEF 
REDDVFLTELMKLANRFQLQLLRERCEKGVMSLVNVRNCIRFYQ 
TAEELNASTLMNYCAEIIASHWVSEVEGVNKAL 


6741 


141 


960 


PLTLPFSSRARAGHTMNTSPGTVGSDPVILATAGYDHTVRFWQA 
HSGICTRTVQHQDSQVNALEVTPDRSMIAAAVQPVSLGYQHIRM 
YDLNSNNPNPIISYDGVNKNIASVGFHSDGRWMYTGGEDCTARI
WDLRSRNLQCQRIFQVNAPINCVCLHPNQAELIVGDQSGAIHIW
DLKTDHNEQLIPEPEVSITSAHIDPDASYMAAVNSTLVPFSCLL
PLAIGILQEGEFESLARRGLLFLACQGNCYVWNLTGGIGDEVTQ 
LIPKTKIP 


6742 


141 


960


PLTLPFSSRARAGHTMNTSPGTVGSDPVILATAGYDHTVRFWQA
HSGICTRTVQHQDSQVNALEVTPDRSMIAAAVQPVSLGYQHIRM
YDLNSNNPNPIISYDGVNKNIASVGFHEDGRWMYTGGEDCTARI
WDLRSRNLQCQRIFQVNAPINCVCLHPNQAELIVGDQSGAIHIW
DLKTDHNEQLIPEPEVSITSAHIDPDASYMAAVNSTLVPFSCLL
PLAIGILQEGEFESLARRGLLFLACQGNCYVWNLTGGIGDEVTQ 
LIPKTKIP 


6743 


1 


412 


MHSTQDKSLHLEGDPNPSAAPTSTCAPRKMPKRISISKQLASVK 
ALRKCSDLEKAIATTALIFRNSSDSDGKLEKAIAKDLLQTQFRN 
FAEGQETKPKYREILSELDEHTENKLDFEDFMILLLSITVMSDL 
LQNIR


6744 


95 


1343 


RTPARNRCAGCEVLSRFSSPNKASSFALQSAGGGLPAVRALRRD
RQKVSTVGYGMDEVEQDQHEARLKELFDSFDTTGTGSLGQBELT 
DLCHMLSLEEVAPVLQQTLLQDNLLGRVHFDQFKEALILILSRT 
LSNEEHFQEPDCSLEAQPKYVRGGKRYGRRSLPEFQESVEEFPE 
VTVIEPLDEEARPSHIPAGDCSEHWKTQRSEEYEAEGQLRFWNP
DDLNASQSGSSPPQDWIEEKLQEVCEDLGITRDGHLNRKKLVSI 
CEQYGLQNVDGEMLEEVFHNLDPDGTMSVEDFFYGLFKNGKSLT
PSASTPYRQLKRHLSMQSFDESGRRTTTSSAMTSTIGFRVFSCL
DDGMGHASVERILDTWQEEGIENSQEILKALDFGLDGNINLTEL
TLALENELLVTKNSIHQACI


6745 


1 


588 


TFRDQGWAQRRRWLLGCASWESWEAAIAAGPGLPSSTARQQNNP
AAGTECFAAVWARGTAMGSVLSTDSGKSAPASATARALERRRDP
ELPVTSFDCAVCLEVLHQPVRTRCGHVFCRSCIATSLKNNKWTC
PYCRAYLPSEGVPATDVAKRMKSEYKNCAECDTLVCLSEMRAHI
RTCQKYIDKYGPLQELEETA


6746 


110 


492 


GATGAMAESAPARHRRKRRSTPLTSSTLPSQATEKSSYFQTTEI
SLWTWAAIQAVEKKMESQAARLQSLEGRTGTAEKKLADCEKMA
VEFGNQLEGKWAVLGTLLQEYGLIiQRRLENVENLLRNRN


6747 


247 


484 


EAVTFKDVAWFTEEELGLLDLAQRKLYRDVMLENFRNLLSVGH
QPFHRDTFHFLREEKFWMMDIATQREGNSVYAGVC


6748 


201 


665 


MTTFKEAVTFKDVAWFTEEELGLLDPAQRKLYRDVMLENFRNL 
LSVGNQPFHQDTFHFLGKEKFWKMKTTSQREGNSGGKIQIEMET 
VPEAGPHEEWSCQQIWEQIASDLTRSQNSIRNSSQFFKEGDVPC 
QIEARLSISXVQQXPYRCNECKQ


6749 


95 


719 


RREVKGGDGVCPRARGSPQSQQFPSCAGGGEGLQQSGEALDGAM
SAGGPCPAAAGGGPGGASCSVGAPGGVSMFRWLEVLEKEFDKAF
VDVDLLLGEIDPDQADITYEGRQKMTSLSSCFAQLCHKAQSVSQ
INHKLEAQLVDLKSELTETQAEKWLEKEVHDQLLQLHSIQLQL
HAKTGQSADSGTIKAKLSGPSVEELERELKAN


6750 


3 


428 


SCESRRPGAKWVWASGALPRDTTGLGSEQPSGDVAQSNRATMGT
TAPGPIHLLELCDQKLMEFLCNMDNKDLVWLEEIQEEAERMFTR 
EFSKEPELMPKTPSOKNRRKKRRISYVODRMRnPTRRRT coowc 
RSSQLSSRR 


6751 


152 


1417 


PTKATEMAGASVKVAVRVRPFNSREMSRDSKCllQMSGSTTTIV 
NPKQPKETPKSFSFDYSYWSHTSPEDINYASQKQVYRDIGEEML 
QHAFEGYNVCIFAYGQTGAGKSYTMMGKQEKDQQGIIPQLCEDL 
FSRINDTTNDNMSYSVEVSYMEIYCERVRDLLNPKNKGNLRVRE 
HPLLGPYVEDLSKLAWSYNDIQDLMDSGNKARWAATNMNETS 
SRSHAVFNIIFTQKRHDAETNITTEKVSKISLVDLAGSERADST
GAKGTRLKEGANINKSLTTLGKVISALAEMDSGPNKNKKKKKTD 
FIPYRDSVLTWLLRENLGGNSRTAMVAALSPADINYDETLSTLR 
YADRAKQIRCNAVINEDPNNKLIRELKDEVTRLRDLLYAQGLGD 
ITDMTNALVGMSPSSSLSALSSRNV 


6752 


24 


1834 


RNCVPPLGCYRSRVKFHSDIKMQYSHHCEHLLERLNKQREAGFL 
CDCTIVIGEFQFKAHRNVLASFSEYFGAIYRSTSENNVFLDQSQ 
VKADGFQKLLEFIYTGTLNLDSWNVKEIHQAADYLKVEEWTKC 
KIKMEDFAFXANPSSTEISSITGNIELNQQTCLLTLRDYNNREK 
SEVSTDLIQANPKQGALAKKSSQTKKKKKAFNSPKTGQNKTVQY 
PSDILENASVELFLDANKLPTPWEQVAQINDNSELELTSWEN 
TFPAQDIVHTVTVKRKRGKSQPNCALKEHSMSNIASVKSPYEAE 
NSGEELDQRYSKAKPMCNTCGKVFSEASSLRRHMRIHKGVKPYV 
CHLCGKAFTQCNQLKTHVRTHTGEKPYKCELCDKGFAQKCQLVF
HSRMHHGEEKPYKCDVCNLQFATSSNLKIHARKHSGEKPYVCDR
CGQRFAQASTLTYHVRRHTGEKPYVCDTCGKAFAVSSSLITHSR
KHTGEKPFICELCGNSYTDIKNLKKHKTKVHSGADKTLDSSAED
HTLSEQDSIQKSPLSETMDVKPSDMTLPLALPLGTEDHHMLLPV
TDTQSPTSDTLLRSTVNGYSEPQLIFLQQLY 


6753 


2 


1305 


VPSLPYPPQKWAHTEFTTSSDSETANGIAKPDPVMPGGEEKAS
PFGIKLRRTNYSLRFNCDQQAEQKKKKRHSSTGDSADAGPPAAG 
SARGEKEMEGVALKHGPSLPQERKQAPSTRRDSAEPSSSRSVPV 
AHPGPPPASSQTPAPEHDKAANKMPLAQKPALAPKPTSQTPPAS 
PLSKLSRPYLVELLSRRAGRPDPEPSEPSKEDQESSDRRPPSPP
GPEERKGQKRDEEEEATERKPASPPLPATQQEKPSQTPEAGRKE
KPMbQSRHSLDGSKLTEKVETAQPLWITLALQKQKGFREQQATR
EERKQAREAKQAEKLSKENVSVSVQPGSSSVSRAGSLHKSTALP
EEKRPETAVSRLERREQLKKANTLPTSVTVEISYSSPAAPLVKE
VSKRFSSPDDAPVSSEPAWLALAKRKAKAWSDCPLIIK


6754 


2 


413


FVRRRRRRLGGPEVNTMSSLHKSRIADFQDVLKEPSIALEKLRE
LSFSGIPCEGGLRCLCWKILLNYLPLERASWTSILAKQRELYAQ 
FLREMIIQPGIAKANMGVSREDVTFEDHPLNPNPDSRWNTYFKD 
NEVLL 


6755 


298 


1343 


PGLQLQVALEADWFLDMPGGRRGPSRQQLSRSALPSLQTLVGGG
CGNGTGLRNRNGSAIGLPVPPITALITPGPVRHCQIPDLPVDGS 
LLFEFLFFIYLLVALFIQYINIYKTVWWYPYNHPASCTSLNFHL 
IDYHLAAFITVMLARRLVWALISEATKAGAASMIHYMVLISARL
VLLTLCGWVLCWTLVNLFRSHSVLNLLFLGYPFGVYVPLCCFHQ 
DSRAHLLLTDYNYWQHEAVEESASTVGGLAKSKDFLSLLLESL 
KEQFNNATPIPTHSCPLSPDLIRNEVECLKADFNHRIKEVLFNS
LFSAYYVAFLPLCFVKVSGYLTFMCFLDLCVNYINWVFLV 


6756 


180 


754 


IERALGSLPLSIPVSWGSLRTLKYQQQPLRPKVLLCQTRVQCHD
LRSLQPQPPGLKQSFCLRVLGLQTGATTPGLRDLTCKELIILTE 
REAQKRKKRKEKESGMALTQGPLTFRDVAIEFSQEEWKSLDPVQ 
KALYWDVMLENYRNLVFLGKDNFALEVKICPRVFLYFLCCLSWE 
PFHYLTETEALLTHK 


6757 


2 


459 


NSRVEAPEAHSRESQGSDAMRKHLSWWWLATVCMLLFSHLSAVQ
TRGIKHRIKWNRKALPSTAQITEAQVAENRPGAFIKQGRKLDID
FGAEGNRYYEANYWQFPDGIHYNGCSEANVTKEAFVTGCINATQ
AANQGEFQKPDNKLHQQVLW 


6758


1 


1008 


ASGPELPGRRFRDRAPWLPARLLRGVliAVWVSLSAliGPGSFCRR 
RVPSLAQLGHSEAAPSPDDVRWSRVPDRCPEERDRAWPPPPPPS 
LPPSFRRNMANNSPALTGNSQPQHQAAAAAAQQQQQCGGGGATK 
PAVSGKQGNVLPLWGNEKTMNLNPMILTNILSSPYFKVQLYELK
TYHEWDEIYFKVTHVEPWEKGSRKTAGQTGMCGGVRGVGTGGI
VSTAFCLLYKLFTLKLTRKQVMGLITHTDSPYIRALGFMYIRYT 
QPPTDLWDWFESFLDDEEDLDVKAGGGCVMTIGEMLRSFLTKLE 
WFSTLFPRIPVPVQKNIDQQIKTRPRKI 


6759 


1 


513 


RKHNFHSLDGTSTRAFHPQTGLPLLSSPVPQRKTQSGCFDLDSS 
LLHLKSFSSRSPRPCLNIEDDPDIHEKPFLSSSAPPITSLSLLG 
NFEESVLNYRFDPLGIVDGFTAEVGASGAFCPTHLTLPVEVSFY 
SVSDDNAPSPYMGVITLESLGKRGYRVPPSGTIQWCVL 


6760 


239 


606 


VLSKKKGLSAEEKRTRMMEIFSETKDVFQL3CDLEKIAPKEKGIT
AMSVKEVLQSLVDDGMVDCERlGTSNYYWAFPSKALHARKflKLE
VLESQLSEGSQKHASLQKSIEKAKIGRCETEERT


6761


29 


1733 


ERTLRGLREVAAPSDVADAAVSRRGRCCCCLHCTQTQVAQDCPS 
SSSSVQRCELSLFQSLHTMTSKKLVNSVAGCADDALAGLVACNP 
NLQLLQGHRVALRSDLDSLKGRVALLSGGGSGHEPAHAGFIGKG 
MLTGVIAGAVFTSPAVGSILAAIRAVAQAGTVGTLLIVKNYTGD
RLNFGLAREQARAEGIPVEMWIGDDSAFTVLKKAGRRGLCGTV
LIHKVAGALAEAGVGLEEIAKQVNWTKAMGTLGVSLSSCSVPG
SKPTFELSADEVELGLGIHGEAGVRRIKMATADEIVKLMLDHMT
NTTNASHVPVQPGSSWMMVNNLGGLSFLELGIIADATVRSLEG
RGVKIARALVGTFMSALEMPGlSIiTr.J.»LVDEPLLKLIDAETTAA 
AWPNVAAVSITGRKRSRVAPAEPQEAPDSTAAGGSASKRMALVL
ERVCSTLLGLEEHLNALDRAAGDGDCGTTHSRAARAIQEWLKEG 
PPPASPAQLLSKLSVLLLEKMGGSSGALYGLFLTAAAQPliKAKT 
SLPAWSAAMDAGLEAMQKYGKAAPGDRTMLDSLWAAGQEL


6762 


3 


613 


ASTISWRLCVAGAEARRPVPVAGERAGGGAMWFMYLLSWLSLFI 
QVAFITLAVAAGLYYLAELIEBYTVATSRIIKYMIWFSTAVLIG
LYVFERFPTSMIGVGLFTNLVYFGLLQTFPFIMLTSPNFILSCG 
LVVVNHYLAFQFFAKEYYPFSEVLAYFTFCLWIIPFAFFVSLSA 
GENVLPSTMQPGDDWSNYFTKGKRGK


6763

2 


760 


SGPDFPGRRFRGCCCVRPPAGAGMELGGHWDMNSAPRLVSETAE
RKQEQKTGTEAEAADSGAVGARRFLLCLYU5GFLDLFGVSMWP
LLSLHVKSLGASPTVAGIVGSSYGILQLFSSTLVGCWSDWGRR
SSLLACILLSALGYLLLGAATNVFLFVLARVPAGIFKHTLSISR
ALLSDWPEKERPLVIGHFNTASGVGFILGPWGGYLTELEDGF 
YLTAFICFLVFILNAGLVWFFPRREAKPGSTE 


6764


80 


438 


LKKMDTMMLSVRNtiFEQLVRRVEILSEGNEVQFIQLAKDFEDFR
KKWQRTDHELGKYKDLLMKAETERSALDVKLKHARNQVDVEIKR
RQRAEADCEKLERQIQLIREMLMCDTSGSIQ 


6765 


3 


550 


ARYSRVDHFCRRRCRAVARAPRFLLQFPSGPSRHFLAACVARWL 
RGSVLVSEALSGSAMDGIVTEVAVGVKRGSVELLSGSVLSSPNS
NMSSMWTANGNDSKKFKGEDKMDGAPSRVLHIRKLPGEVTETE
VIALGLPFGKVTNILMLKGKNQAFLELATEEAAITNGNYYSAVT 
PHLRNQ


6766

1 


1287


EGGSFKASLTWLWPLGEMKLHCEVEVISRHLPALGLRNRGKGVR 
AVLSLCQQTSRSQPPVRAFLLISTLKDKRGTRYELRENIEQFFT 
KFVDEGKATVRLKEPPVDXCLSKANSSSLKGFLSAMRLAHRGCN 
VDTPVSTLTPVKTSEFENFKTKMVITSKKDYPLSKNFPYSLEHL
QTSYCGLVRVDMRMLCLKSLRKLDLSHNHIKKLPATIGDLIHLQ
ELNLNDNHLESFSVALCHSTLQKSLWSLDLSKNKIKALPVQFCQ 
LQELKNLKLDDNELIQFPCKIGQLINLRFLSAARNKLPFLPSEF 
RNLSLEYLDLFGNTFEQPKVLPVIKLQAPLTLLESSARTILHNR
IPYGSHIIPFHLCQDLDTAKICVCGRFCLNSFIQGTTTMNLHSV
AHTWLVDNLGGTEAPIISYFCSLGCYVNSSDI


6767 


336 


919 


APMICLCSSDLQFRYKEAFLRDRGLQIGYCSVDDDPRMKHFLNV 
GRLQSDNEYKKDFAFCSRSQFHSSTDQPGLLQAKRSQQLASDVHY 
RQPLPQPTCDPEQLGLRHAQKAHQLQSDVKYKSDLNLTRGVGWT 
PPGSYKVEMARRAAELANARGLGLQGAYRGAEAVEAGDHQSGEV 
NPDATEILHVKKKKALLL


6768


2 


363 


PGSTISCYLLSEGSLPLCMQVACGEEKHRAPTMKTLRARFKKTE
LRLSPTDLGSCPPCGPCPIPKPAARGRRQSQDWGKSDERLLQAV
ENNDAPRVAALIARKGLVPTKLDPEGKSAFHL 


6769 


284 


396 


MSTPDFSTAENNQELANBVSCLKAMLTLMLQAMGQAD 


6770 


1 


397 


QRNYQVIWSSTMAKLHDYYKDEWKKLMTEFNYNSVMQVPRVEK
ITI^MGVGEAIADKKLLDNAAADLAAISGQKPLITKARKSVAGF 
KIRQGYPIGCKVTLRGERMWEFFERLITIAVPRIRDFRGLSAKS


6771 


3 


378 


APAGTLAMTGKSVXDVDRYQAVIiANLLLEEDNKFCADCQSKGPR
WASWNIGVFICIRCAGIHRNLGVHISRVKSVNLDQWTQEQIQCM 
QEMGNGKANRLYEAYLPETFRRPQIDPYLFWSNLEG 


6772 


1 


1400 


aaaflqgmtvngfintvitsl\errydlhsyqsc5liassydiaa*-
clcltfvsyfggsg\hkprwlgwgr\vlmgtgslvfalphftag
p**gwkldagvrtcpanpr\pvcag\htsglsryqlvfmlgqfl
hgvgatplytlgvtyldenvksscspiyiaifytaailgpaagy
liggallniytemgrrtelttesplwvgawwvgflgsgaaafft
avpilgyprqlpgsqryavmraaemhqlkdssrgeasnpdfgkt
irdlplsiwlllknptfillclagateatlitgmstfspkfles
qfslsaseaatlfgylwpaggggtflggffvnklrlrgsavik
fclfctwsllgilvfslhcpsvpmagvtasyggsllpeghlnl
tapcnaacscqpehyspvcgsdglmyfslchagcpaatetnvdg
qkvyrdcscipqnlssgfghatagkctst


6773 


1 


630 


PWEAPKEHKYKAEEHTWLTVTGEPCHFPFQYHRQLYHKCTHKG 
RPGPQPWCATTPNFDQDQRWGYCLEPKKVKDHCSKHSPCQKGGT 
CVNMPSGPHCLCPQHLTGNHCQKEKCFEPQLLRFFHKNEIWYRT 
EQAAVARCQCKGPDAHCQRLASQACRTNPCLHGGRCLEVEGHRL 
CHCPVGYTGPFCDVGE*GSGASRRPAPRWDGLAR 


6774 


146 


389 


LTELSDQQYFLFFILSS/WVPTFLSMDVDGRVIKADSFSKIISS 
GLRIGFLTGPKPLIERVILHIQVSTLHPSTFNQLMISQ


6775 


104 


614 


TCPSQLRVLTARGGRRAPSPQLWTLVLALIEEKWRSHRILRMNS
GRPETMENLPALYTIFQGEVAMVTDYGAFIKIPGCRKQGLVHRT
HMSSCRVDKPSEIVDVGDKVWVKLIGREMKNDRIKVSLSMKWN
QGTGKDLDPNNV\SLSKKRGGGDPSRITLGRRSPLRLS 


6776 


3 


1108 


HERHERHEGALSQDALLRISIPLDSNMRPEKCRRFVHPQWQLLH 
LNGTFPNTSDADMEPCVDGWVYDRISFSSTIVTEWDLVCDSQSL 
TSVAKFVFMAGMMVGGILGGHLSDRFGRRFVLRWCYLQVAIVGT
CAALAPTFLIYCSLRFLSGIAAMSLITNTIMLIAEWATHRFQAM 
GITLGMCPSGIAFMTLAGLAFAIRDWHILQLWSVPYFVIFLTS 
SWLLESARWLIINNKPEEGLKELRKAAHRSGMKNARDTLTI^EIL 
KSTMKKELEAAQKKKPFLGERLHMPNICKRISLLPFTKFANFMA 
YFGLNLjHG/LKHLGNNVFLLQTLFGAV/TPPGQLVLHLGHWGSG 
RVSSRGRVNCLGLFVLQVW


6777 


779 


63 


CFFHGPAWRDCEVRATFAKKQGQSGIISCIAFSPAQPLYACGSY 
GRSLGLYAWDDGSPLALLGGHQGGITHLCFHPDGNRFFSGARKD 
AELLCWDLRQSGYPLWSLGREVTTNQRIYFDLDPTGQFLVSGST
SGAVSVWDTDGPGNDGKPEPVLSFLPQKDCTNGVSLHPSLPLLG 
HCLPVSVCFLSPTESGGRRRGAGPSLGSPRRHVHLECRLQLWWC 
GGGARLQHP**SPRARKGR


6778 


311 


805 


IQSITDESRGSIRRKNPANTRLRbNVP\EETAGDSE/ERSPEEE 
VQADPRIRSASPKCPTSSPFPKGRSPEGEGET\DPEKVHFHPGP 
KDKSVAEKN\KGP\SPVSSEGIKDFFSMKPEWENLNQSNVRRMH 
T\AVRLNEVIVKKSRDAKLVLLNMPGPPRNRNGDENY


6779 


2 


535 


RALRRQPRLLAANGIEPESMAISEPIKGSRKPCVNKEELiALKKP 
MAKCAWKGPREPPQDARAEAESPGGASESDQDGGHESPPKKKAV
AWVSAKNPAPMRKKKKVSL.GPVSYVIiVDSEDGRKKPVMPKKGPG 
SRREASDQKAPRGQQPAEATASTSRGPKAKPEGSPRRATNESRK 
V 


6780 


3 


403 


HEVNDNKPEININLMSPGKEEISYIFEGDPIDTFVALVRVQDKD 
SGLNGEIVCKLHGHGHFKLQKTYENNYLILTNATLDREKRSEYS 
LTVIAEDRGTPSLSTVKHFTVQINDINDNPPHFQR9RYEFVISE 
K 


6781 


1 


1269 


APTRPVFPTLQDIiSSSKEPSNSLNLPHSNELCSSLVHPELSEVS 
SNVAPSIPPVMSRPVSSSSISTPLPPNQITVFVTSNPITTSANT 
SAALPTHLQSALMSTWTMPNAGSKVMVSEGQSAAQSNARPQFI 
TPVFINSSSIIQVMKPSQPSTIPAAPLTTNSGLMPPSVAWGPL 
HIPQNIKFSSAPVPPNALSSSPAPNIQTGRPLVLSSRATPVQLP 
SPPCTSSPWPSHPPVQQVKELNPDEASPQVNTSADQNTLPSSQ 
STTMVSPLLTNSPGSSGNRRSPVSSSKGKGKVDKIGQILLTKAC
KKVTGSLEKGEEQYGADGETEGQGLDTTAPGLMGTEQLSTELDS 
KTPTPPAPTLLKMTSSPVGPGTASAGPSLPGGALPTSVRSIVTT 
LVPSELISAVPTTKSNHGGIASESLAG 


6782 


3 


1327 


RKPTVIRIPAKPGKCLHEDPQSPPPLPAEKPIGNTFSTVSGKLS 
NVERTRNLESNHPGQTGGFVRVPPRLPPRPVNGKTIPTQQPPTK 
VPPERPPPPKLSATRRSNKKLPFNRSSSDMDLQKKQSNLATGLS 
KAKSQVFKNQDPVLPPRPKPGHPLYSKYMLSVPHGIANEDIVSQ 
NPGELSCKRGDVLVMLKQTENNYLECQKGEDTGRVHLSQMKLIT
PLDEHLRSRPNPFSPPKAPSHAQKPVDSGAPHAWLHDFPAEQV
DDLNLTSGEIVYLLEKIDTDWYRGNCRNQIGIFPANYVKVIIDI 
PEGGNGKRECVSSHCVKGSRCVARFEYIGEQKDELSFSEGEIII 
LKEYVNEEWARGEVRGRTGIFPLNFVEPVEDYPTSGANVLSTKV
PLKTKKEDSGSNSQVNSLPAEWCEALHSFTAETSDDLSFKRGDR 
I 


6783 


3 


1750 


SYHHHHAQQSAAASPNLTASQKTVTTTSMITTKTLPLVLKAATA
TMPASWGQRPTIAMVTAINSQKAVLSTDVQNTPVNLQTSSKVT
GPGAEAVQIVAKNTVTLQVQATPPQPIKVPQFIPPPRLTPRPNF
LPQVRPKPVAQNNIPIAPAPPPMLAAPQLIQRPVMLTKFTPTTL
PTSQNSIHPVRWNGQTATIAKTFPMAQLTSIVIATPGTRLAGP
QTVQLSKPSLEKQTVKSHTETDEKQTESRTITPPAAPKPKREEN
PQKLAFMVSLGLVTHDHLEEIQSKRQERKRRTTANPVYSGAVFE
PERKKSAVTYLNSTMHPGTRKRGRPPKYNAVLGFGALTPTSPQS
SHPDSPENEKTETTFTFPAPVQPVSLPSPTSTDGDIHEDFCSVC
RKSGQLLMCDTCSRVYHLDCLDPPLKTIPKGMWICPRCQDQMLK
KEEAIPWPGTLAIVHSYIAYKAAKEEEKQKLLKWSSDLKQEREQ
LEQKVKQLSNSISKCMEMKNTILARQKEMHSSLEKVKQLIRLIH
GIDLSKPVDSEATVGAISNGPDCTPPANAATSTPAPSPSSQSCT
ANCNQGEETK


6784 


3 


1750 


SYHHHHAQQSAAASPNLTASQKTVTTTSMITTKTLPLVLKAATA
TMPASWGQRPTIAMVTAINSQKAVLSTDVQNTPVNLQTSSKVT
GPGAEAVQIVAKNTVTLQVQATPPQPIKVPQFIPPPRLTPRPNF
LPQVRPKPVAQNNIPIAPAPPPMLAAPQLIQRPVMLTKFTPTTL
PTSQNSIHPVRWNGQTATIAKTFPMAQLTSIVIATPGTRLAGP
QTVQLSKPSLEKQTVKSHTETDEKQTESRTITPPAAPKPKREEN
PQKLAFMVSLGLVTHDHLEEIQSKRQERKRRTTANPVYSGAVFE
PERKKSAVTYLNSTMHPGTRKRGRPPKYNAVLGFGALTPTSPQS
SHPDSPENEKTETTFTFPAPVQPVSLPSPTSTDGDIHEDFCSVC
RKSGQLLMCDTCSRVYHLDCLDPPLKTIPKGMWICPRCQDQMLK
KEEAIPWPGTLAIVHSYIAYKAAKEEEKQKLLKWSSDLKQEREQ
LEQKVKQLSNSISKCMEMKNTILARQKEMHSSLEKVKQLIRLIH
GIDLSKPVDSEATVGAISNGPDCTPPANAATSTPAPSPSSQSCT
ANCNQGEETK


6785


1 


528 


LGNTVLHYCSMYSKPECLKLLLRSKPTVDIVNQAGETALDIAKR 
LKATQCEDLLSQAKSGKFNPHVHVEYEWNLRQEEIDESDDDLDD 
KPSPVKKERSPRPQSFCHSSSISPQDKLALPGFSTPRDKQRLSY 
GAFTNQIFVSTSTDSPTSPTTEAPPLPPRNAGKGPTGPPITPHR 


6786 


1820 


1397 


RSPKVLVIiAPTRELANHVSRDFKDI\TRKLTVARFYGGTSYQSQ
INHIRNGIDILVGTPGRIKDHLQSGRLDLSKLRHWLDEVDQML

riT.f3T?2VPOTrTrriT TUT?C vt^TT^OTyn'MDrvT'T t pct\ T'/rj/'M'irtrvPT tt\ \ w 
uukjc rtayvaui J.ntn> x )S.XiJZ)EiUx4r\^ 1 LiLtr oAiLi'yWVl J. VA\KK 

YMKSRYEQVDLDGKMTQKAATTVEHLAIQCHWSQRPAVIGDVLQ
VYSGSEGRAIIFCETKKNVTEMAMNPHIKQNAQCLHGDIAQSQR

£j x 1 iji\vjrKJio&c JWiiVAiiN VAnXxtiMJX V JiVUJjV Aybor'tryiJ V J5S 

YIHRSGRTGRAGRTGICICFYQPRERGQLRYVEQKAGITFKRVG
VPSTMDLVKSKSMDAIRSLASVSYAAVDFFRPSAQRLIEEKGAV
DALAAALAHISGASSFEPRSLITSDKGFVTMTLESLEEIQDVSC
AWKELNRKLSSNAVSQITRMCLLKGNMGVCFDVPTTESERLQAE
WHDSDWILSVPAKLPEIEEYYDGNTSSNSRQRSGWSSGRSGRSG
RSGGRSGGRSGRQSRQGSRSGSRQDGRRRSGNRNRSRSGGHKRS
FD*VFYHLVDFLSDFLVDSVYLTGRQIDHLTGLTGLIDHLTSHS
SVWN


6787 


2646 


2270 


PSSFPKNVPLEELEEPPK*KRSGLGSLTPKSQIQNGP*PQTFFF
FELGSPSGVISAHCNLRLLGSSDSPAPASRVAGIIGTCHHAWLI
LVFLVEMGFHHVGQAGLKLLTL\VIHPPWPPKVLGLQT


6788 


16 


936 


GGTVDLR\DMLAVSV1AAVRGGR/ATVRRVRESNVLHEKSKGKT 
REGAEDKMTSGDVLSNRKMFYLLKTAFPSVQINTEEHVD\ELDQ 
FTEDLSK\YVTTMVCVAVNGKPMLGVIHKPFSEYTAWAMVDGGS 
NVKARSSYNEKTPRIWSRSHSGMVKQVALQTFGNQTTIIPAGG
AGYKVLALLDVPDKSQEKADLYIHVTYIKKWDICAGNAILK7VLG 
GHMTTLSGEEISYTGSDGIEGGLLASIRMNHQALVRKLPDLEKT
GHK 


6789 


2 


678 


GNGINVLKIAPESAIKFMAYEQIKRLVW**PGDS*GF/YERLVA 
GSLAGAIAQSSIYPMEVLKTRMALRKTGQYSGMLDCARRILARE 
GVAAFYKGYVPNMLGIIPYAGIDLAVYETLKNAWLQHYAVNSAD
PGVFVLLACGTMSSTCGQLASYPLALVRTRMQAQASIEGAPEVT
MSSLFKHILRTEGAFGLYRGLAPNFMKVIPAVSISYWYENLKI 
TLGVQSR 


6790 


2 


4068 


APPAGRRRMQAAPRAGCGAALLLWIVSSCLCRAWTAPSTSQKCD
EPLVSGLPHVAFSSSSSISGSYSPGYAKINKRGGAGGWSPSDSD 
HYQWLQVDFGNRKQISAIATQGRYSSSDWVTQYRMLYSDTGRNW 
KP YHQDGN I WAFPGN.T NSDG WRHELQHP I IAR YVR I VPLDWNG 
EGRIGLRIEVYGCSYWADVINFDGHWLPYRFRNKKMKTLKDVI 
ALNFKTSESEGVILHGEGQQGDYITLELKKAKLVLSLNLGSNQIi 
GPIYGHTSVMTGSLLDDHHWHSWIERQGRSINLTLDRSMQHFR 
TNGEFDYLDLDYEITFGGIPFSGKPSSSSRKNFKGCMESINYNG
VNITDLARRKKLEPSNVGNLSFSCVEPYTVPVFFNATSYLEVPG 
RLNQDLFSVSFQFRTWNPNGLLVFSHFADNIiGNVEIDLTESKVG 
VHINITQTKMSQIDISSGSGLNDGQWHEVRFLAKENFAILTIDG 
DEASAVRTNSPLQVKTGEKYFFGGFLNQMNNSSHSVLQPSFQGC 
MQLIQVDDQLVNLYEVAQRKPGSFANVSIDMCAIIDRCVPNHCE
HGGKCSQTWDSFKCTCDETGYSGATCHNSIYEPSCEAYKHLGQT 
SNYYWIDPDGSGPLGPLKVYCNMTEDKVWTIVSHDLQMQTPWG 
YNPEKYSVTQLVYSASMDQISAITDSAEYCEQYVSYFCKMSRLL 
NTPDGSPYTWWVGKANEKHYYWGGSGPGIQKCACGIERNCTDPK 
YYCNCDADYKQWRKDAGFLSYKDHLPVSQVWGDTDRQGSEAKL
SVGPLRCQGDRNYWNAASFPNPSSYLHFSTFQGETSADISFYFK
l ui FViKsvr JjJMNMuKiilJr 1K.JjciLjK.oAI EVoFSF DVGNGPVEIVVR 
SPTPLNDDQWHRVTAERWKQASLQVDRLPQQIRKAPTEGHTRL
ELYSQLFVGGAGGQQGFLGCIRSLRMNGVTLDLEERAKVTSGFI 
SGCSGHCTSYGTNCENGGKCLERYHGYSCDCSNTAYDGTFCNKD 
VGAFFEEGMWLRYNFQAPATNARDSSSRVDNAPDQQNSHPDLAQ 
LiLt ±t\v or o i 1 ivrtr u ±Liu x loor i i. Uc JLitWJjVKPToSIjQIRYNIjG 
GTREPYNIDVDHRNMANGQPHSVNITRHEKTIFLKLDHYPSVSY
HLPSSSDTLFNSPKSLFLGKVIETGKIDQEIHKYNTPGFTGCLS 
RVQFNQIAPLKAALRQTNASAHVHIQGELVESNCXSASPLTLSPM 
SSATDPWHLDHLDSASADFPYNPGQGQAIRNGVNRNSAIIGGVI 
A\WIFTPSLCTP\VLP*SR*HVSPHKGTLPIPNEAKGAGSRQK 
KPGRRPSMNNDPPTSQRPIDESKKEWPHLRGGYLAMG


6791 


1801 


1193 


TGHEGAKGEKGDKGDLGPRGERGQHGPKGBKGYPGIPPEL/PGW 
S AW* S WL TAASTKV0 A I LLPO P LE * r jG T .O I a pm a T . dth no 
NSG 1 1 FSSVETNIGNFFDVMTGRFGAPVSGVYFFTFSMMKHEDV 
EEVYVYLMHNGiWVFSMYSYEMKGKSDTSSNHAVIiKLAK^ 
LRMGNGALHGDHQRFSTFAGFLLFETK 


6792


33 


1073 


VRHTNWGVDMYLFSLGSESPKGAIGHIVSTEKTILAVERNKVLL 
PPLWNRTFSWGFDDFSCCl^SYGSDKVIiMTFENLiAAWGRCLCAV 
CPSPTTIVTSGTSTWCVWELSMTKGRPRGLRLRQALYGHTQAV 
TCLAASVTFSLLVSGSQDCTCILWDLDHLTHVTRLPAHREGISA 
ITISDVSGTIVSCAGAHLSLWNVNGQPLASITTAWGPEGAITCC 
CLMEGPAWDTSQI I ITGSQDGMVRVWKT/VGCEDVCSWTASRRG 
APGSASKPKRPQVGEEPGLESRAGR*HCFDREAQQNQP\PVTAL 
AVSRNHTKLLVGDERGRIFCWSADG*EERGSRGSGTTVPG


6793 


2340 


805 


GRKEANY\YGSLTQAGTVSLGLDAEGQEVFVPFSAVLPMVAPND
LVFDGWDISSLNLAEAMRRAKVLDWGLQEQLWPHMEALRPRPSV 
YIPEFIAANQSARADNIilPGSRAQQLEQIRRDIRDFRSSAGLDK 
VIVLWTANTERFCEVIPGLNDTAENLLRTIBLGLEVSPSTLFAV
ASILEGCAFLNGSPQNTLVPGALELAWQHRVFVGGDDFKSGQTK 
VKSVLVDFLIGSGLKTMSIVSYNHLGNNDGENLSAPLQFRSKEV
SKSNWDDMVQSNPVLYTPGEEPDHCWIKYVPYVGDSKRALDE 
YTSELMLGGTNTLVLHNTCEDSLLAAPIMLDLALLTELCQRVSF
CTDMDPEPQTFHPVLSLLSFLFKAPLVPPGSPWNALFRQRSCI 
ENILRACVGLPPQNHMLLEHKMERPGPSLKRVGPVAATYPMLNK 
KGPVPAATNGCTGDANGHLQEEPPMPTT*GPGHTVSRLFLPAAP 
HDPTLKAPTNKGRCHFSPPSTWGSWGL 


6794 


169 


1349 


DDVKRKPEASAH*EKPGPPSRPGVRGGRERAGGRGSHGARSCR\ 
EPAPPAPAPPEDHPDEEMGFTIDIKSFLKPGEKTYTQRCRLFVG 
NLPTDITEEDFKRLFERYGEPSEVFINRDRGFGFIRLESRTLAE 
IAKAELDGTILKSRPLRIRFATHGAALTVKNLSPWSNELLEQA
FSQFGPVEKAWWDDRGRATGKGFVEFAAKPPARKALERCGDG
AFLLTTTPRPVIVEPMEQFDDEDGLPEKLMQKTQQYHKEREQPP 
RFAQPGTFEFEYASRWKALDEMEKQQREQVDRNIREAKEKLEAE
MEAARHEHQLMLMRQDLMRRQEELRRLEELRNQELQKRKQIQLR 
HEEEHRRREEEMIRHREQEELRRQQEGFKPNYMENYVCHFLR


6795 


1740 


1010 


GPRRQTQVRDHELDSF*DWAAQETDCAQNSGERL*KGV/LENFS 
TMSKSAVKISLDLLSNPLCEQDQDLLNMVTALDTAMFCRMDAFNQ 
EKVNQIQKTVIEPLKKFGSVFPSLNMAVKRREQALQDYRRLQAK
VEKYEEKEKTGPVIaAKLHQAREELRPVREDFEAKNRQLLEEMPR
FYGSRLDYFQPSFESLIRAQWYYSEMHKIFGDLSHQLDQPGHS 
DEQRERENEAKLSELRALSIVADD


6796 


48 


683 


GKEIQIPTIKLAWLLFGLE*PVGALGKGWSF**SHVALGQLGW
LTRAVRSSWRWELCVSAQEWSQRSA*SSPSPVGACPSLNPPET 
SVQEGRDCWQR+LPRLFSALVGQPGCWPQGAPPERCV*PGRCKW 
HLQSQVLR*ERRRCCRCLPRFA*GWRRRHQRLGLGIHPAPLGST 
SPPHPEGNSQQCRR*GWAAELRLPSSWL*GKLGC*


6797 


1620 


211 


TERMTPSQPTRGSSCTRFSSMLWTSTWRCLTCHWAGMRMSVVGV 
TLGPMAQGLLSASGTTTEATWTRPTTHLTLIRWWLLTASRVDPP 
ERPPPPPSDDLTLLESSSSYKNL/DAQIPQ/DWSMSPSTSG*RP 
LTSRASSIMRSRTAIPSAS*SRLTTKHTVGGSPSAWRPRPTSRS 
VSTPVSSSTETTASGSCLTWWSSSPAPCPSSSAPAHSFEASCCK 
TSLWGSCGGSGDGSSACGSGWNLSMAGTSCSSPAMCSPSRAPS*
RSASRPRTWRATTSAASSWAPRRCWCGWA*SAT*PSSTTTISSS 
PHCGWPCPASCASAAAWLSSTWATASVAGSCWGPIM*SSAHSPW 
CLSACSRSSMGTTCL*RSPP\SGASRAAAAWCX3SSPSSTFTPSS 
ASSSTWCSASSSRSSPAPTTPSSIPAAQAQRRASCRPTSHSART
APPPASSAAGAARPAAFSAAAEGTPRRSIRCW


6798 


3894 


1696 


STISWESLESWLNKATNPSNRQEDWEYIIGFCDQINKELEG*VS
ALWGQLRGSGLGRGTTMAKEGQPGSPRLSALECVLLVPQ\PQIA 
VRLLAHKIQSPQEWEALQALTYLGDRVSEKVKTKVIELLYSWTM 
ALPEEAKIKDAYHMLKRQGIVQSDPPIPVDRTLIPSPPPRPKNP 
VFDDEEKSKLLAKLLKSKNPDDLQEANKLIKSMVREDEARIQKV 
TKRLHTLEEVNNNVRLLSEMLLHYSQEDSSDGDRELMKELFDQC 
ENKRRTLFKLASETEDNDNSLGDILQASDNLSRVINSYKTIIEG 
QVINGEVATLTLPDSEGNSQCSNQGTLIDLAELDTTNSLSSVLA
PAPTPPSSGIPILPPPPQASGPPRSRSSSQAEATLGPSSTSNAL 
SWLDEELLCLGLADPAPNVPPKESAGNSQWHLLQREQSDLDFFS 
PRPGTAACGASDAPLLQPSAPSSSSSQAPLPPPFPAPWPASVP 
APSAGSSLFSTGVAPALAPKVEPAVPGHHGLALGNSALHHLDAL 
DQLLEEAKVTSGLVKPTTSPLIPTTTPARPLLPFSTGPGSPLFQ 
PLSFQSQGSPPKGPELSLASIHVPLESIKPSSALPVTAYDKNGF
RILFHFAKECPPGRPDVLVVWSMLNTAPLPVKSIVLQAAVPKS 
MKVKLQPPSGTELSPFSPIQPPAAITQVMLLANPLKEKVRLRYK 
LTFALGEQLSTEVGEVDQPPPVEQWGNL 


6799 


3894 


1696 


STISWESLESWLNKATNPSNRQEDWEYIIGFCDQINKELEG*VS 
ALWGQLRGSGLGRGTTMAKEGQPGSPRLSALECVLLVPQ\PQIA 
VRLLAHKIQSPQEWEALQALTYLGDRVSEKVKTKVIELLYSWTM 
ALPEEAKIKDAYHMLKRQGIVQSDPPIPVDRTLIPSPPPRPKNP 
VFDDEEKSKLLAKLLKSKNPDDLQEANKLIKSMVREDEARIQKV 
TKRLHTLEEVNNNVRLLSEMLLHYSQEDSSDGDRELMKELFDQC 
ENKRRTLFKLASETEDNDNSLGDILQASDNLSRVINSYKTIIEG 
QVINGEVATLTLPDSEGNSQCSNQGTLIDLAELDTTNSLSSVLA 
PAPTPPSSGIPILPPPPQASGPPRSRSSSQAEATLGPSSTSNAL 
SWLDEELLCLGLADPAPNVPPKESAGNSQWHLLQREQSDLDFFS 
PRPGTAACGASDAPLLQPSAPSSSSSQAPLPPPFPAPWPASVP 
APSAGSSLFSTGVAPALAPKVEPAVPGHHGLALGNSALHHLDAL
DQLLEEAKVTSGLVKPTTSPLIPTTTPARPLLPFSTGPGSPLFQ 
PLSFQSQGSPPKGPELSLASIHVPLESIKPSSALPVTAYDKNGF
RILFHFAKECPPGRPDVLWWSMLNTAPLPVKSIVLQAAVPKS
MKVKLQPPSGTELSPFSPIQPPAAITQVMLLANPLKEKVRLRYK
LTFALGEQLSTEVGEVDQFPPVEQWGNL 


6800 


404 


1646 


RRSPSTGLSPVPQPSSPSLSDYSIPWSLLLSGTIAWATPGK*AG 
*PQAW*LGLAPAIAFI/GLTRGRKQNKEKMAEGGSGDVDDAGDC
SGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARA
RATRARRAVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILE 
AALIALGNNAAYAFNRDIIRDLGGLPIVAKILNTRDPIVKEKAL
IVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLAGLRLL 
TNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAE
NPAMTRELLRAQVPSSLG\SLFNKKENKEVILKLLVIFENINDN
FKWEENEPTQNQFGEGSLFFFLKEFQVCADKVLGIESHHDFLVK
VKVGKFMAKLAEHMFPKSQE 


6801 


2 


1755 


SAEEFESQOASVTMHDVDAESFEVLVDYCYTGR^SLSEANVERL 
YAASDMLQLEYVREACASFLARRLDLTNCTAILKFADAFGHRKL
RSQAQSYIAQNFKQLSHMGSIREETLADLTLAQLLAVLRLDSLD
VESEQTVCHVAVQWLEAAPKERGPSAAEVFKCVRWMHFTEEDQD 
YLEGLLTKPIVKKYCLDVIEGALQMRYGDLLYKSLVPVPNSSSS 
/R*QQQLSCICSRKSTPETGYVCQGDGDLLWTPQRSLS\RYDPY
SGDIYTMPSPLTSFAHTKTVTSSAVCVSPDHDIYLAAQPRKDLW 
VYKPAQNSWQQLADRLLCREGMDVAYLNGYIYILGGRDPITGVK
LKEVECYSVQRNQWALVAPVPHSFYSFELIWQNYLYAVNSKRM 
LCYDPSHNMWLNCASLKRSDFQEACVFNDEIYCICDIPVMKVYN 
PARGEWRRISNIPLDSETHNYQIVNHDQKLLLITSTTPQWKKNR 
VTVYEYDTREDQWINIGTMLGLLQFDSGFICLCARVYPSCLEPG 
QSFITEEDDARSESSTEWDLDGFSELDSBSGSSSSFSDDEVWVQ
VRPnT?Nanr»nnf5QT. 

V t\ e \l JvLNLtt \i U\Ji W w» O Li 


6802 


157 


1341 


ETFPLFFFLLSKTPGKTASMAHFVQGTSRMIAAESSTEHKECAE 
PSTRKNLMNSLEQKIRCLEKQRKELLEVNQQWDQQFRSMKELYE
RKVAELKTKLDAAERFLSTREKDPHQRQRKDDRQREDDRQRDLT 
RDRLQREEKEKERLNEELHELKEENKLLKGKNTLANKEKEHYEC 
EIKRLNKALQDALNIKCSFSEDCLRKSRVEFCHEEMRTEMEVLK
QQVQIYEEDFKKERSDRERLNQEKEELQQINETSQSQLNRLNSQ 
IKACQMEKEKLEKQLKQMYCPPCNCGLVFHLQDPWVPTGPGAVQ 
KQREHPPDYQWYALDQLPPDVQHKAN/DWCLAPPPVCCQAG/PR 
TPGLK*SSCLWLPKC*NFRFILSKESPSVEVHTNRERQQATRER 
G ! 


6803 


1 


2203 


KLSGRPYRHMGVLGTSKLYDIRKTIFTFTPQFIDQQQFYLALDN 
KMIVEMLRTDLSYLCSRWRMTGQPTITFPISHSMLDEDGTSLNS 
SILAALRKMQDGYFGGARVQTGKLSEFLTTSCCTHLSFMDPGPE
GKLYSEDYDDNYDYLESGNWMNDYDSTSHARCGDEVARYLDHLL 
AHTAPHPKLAPTSQKGGLDRFQAAVQTTCDLMSLVTKAKELHVQ 

QSGEVDFKALVLQLKETSSLQEQADILYMLYTMKGPDWNTELYN
ERSATVRELLTELYGKVGEIRHWGLIRYISGILRKKVEALDEAC 
TDLLSHQKHLTVGLPPEPREKTISAPLPYEALTQLIDEASEGDM
SISILTQEIMVYLAMYMRTQPGLFAEMFRLRIGLIIQVMATELA
HSLRCSAEEATEGLMNLSPSAMKNLLHHILSGKEFGVERK/SVR 
PTDSNVSPAISIHEIGAVGATKTERTGIMQLKSEIKQVEFRRLS
ISAESQSPGTSMTPSSGSFPSAYDQQSSKDSRQGQWQRRRRLDG 

AT iMT?\7PVO PVA V\7W WT .O KCUn T . G \7IJV^ T7T fT DOO TTD T7MTn^TTTT/ 

f\i-UM avrvur lyr^v wiv v jj^M,rlujjo vcajc V XtrOO X X KHM 1 PCjHI K 

FSVHVES\VLNVLLRPEYRQLLVEAILVLTMLADIEIHSIGSII 
AVEKIVHIANDLFLQEQKTLGP\DDTMLAKDPASG\ICTLR\YD
SAPSGRFGTMTYLS\RAA\ATYVQEFLP\HSICAMQ


6804 


1 


951 


GSPGKKEEKAKNKESLCMENSSNSSSDEDEEETKAKMTPTKKYN 
GLEEKRKSLRTTGFYSGFSEVAEKRIKLJJWSDERLQNSRAKDR 
KDVWSSIQGQWPKKTLKELFSDSDTEAAASPPHPAPEEGVAEES 
LQTVAEEESCSPSVELEKPPPVNVDSKPIEEKTVEVNDRKAEFP 
SSGSNFSA*IPLPYLHLNRLHQSL*QKGSRQQSSVTVSEPLAPN 
QEEVRSIK5ETDSTIEVDSVAGELQDLQSERE*LASRF*CQCEL
KQ**SARTRTS*KSLYRSEKSERCSGRRKFIKKAEKKP*SNSGK
\j y ivii \s i\.Krnv. 


6805


1539 


206 


RQPDLKYFGKSFDVSVSESSSLLSNDLPKFADGIKARNRNQNYL
VPSPVLRILDHTAFSTEKSADIVICDEECDSPESVNQQTQEESP 
IEVHTAEDVPIAVEVHAISEDYDIETENNSSESLQDQTDEEPPA
KLCKILDKSQALNVTAQQKWPLLRANSSGLYKCELCEFNSKYFS 
DLKQHMILKHKRTDSNVCRVCKESFSTNMLLIEHAKLHEEDPYI
CKYCDYKTVIFENLSQHIADTHFSDHLYWCEQCDVQFSSSSELY 
LHFQEHSCDEQYLCQFCEHETNDPEDLHSHWNEHACKLIELSD 
KYNNGEHGQYSLLSKITFDKCKNFFVCQVCGFRSRLHTNVNRHV 
AIEHTKIFPHVCDDCGKGFSSMLE\IAKHLNSHLSEGIYLCQYW 
EYSTGQIEDLKIHLDFKHSADLPHKCSDCLMRFGNERELISHLP 
VHETT 


6806


272 


3794 


VALCFPNSDPVMFMDAFYGCLLAELGPVPIEVPLTRKDAGSQQV 
GFLLGSCGVFLALITDACQKGLPKAQTGEVAAFKGWPPLSWLVI 
DGKHLAKPPKDWHPLAQDTGTGTAYIEYKTSKEGSTVGVTVSHA
SLLAQCRALTQACGYSEAETLTNVLDFKRDAGLWHGVLTSVMNR 
MHWSVPYALMKANPLSWIQKVCFYKARAALVKSRDMHWSLLAQ 
RGQRDVSLSSLRMLIVADGANPWSISSCDAFLNVFQSRGLRPEV 
ICPCASSPEALTVAIRRPPDLGGPPPRKAVLSMNGLSYGVIRVD 
TEEKLSVLTVQDVGQVMPGANVCWKLEGTPYLCKTDEVGEICV
SSSATGTAYYGLLGITKNVFEAVPVTTGGAPIFDRPFTRTGLLG
FIGPDHLVFIVGKLtt5L^4VTGVPT^HMAnnwaTaT.a\7PDMW'P\7V 

RGRIAVFSVTVLHDDRIVLVAEQRPDASEEDSFQWMSRVLQAID 
SIHQVGVYCLALVPANTLPKAPLGGIHISETKQRFLEGTLHPCN
VLMCPHTCVTNLPKPRQKQPEVGPASMIVGNLVAGKRIAQASGR
ELAHLEDSDQARKFLFLADVLQWRAHTTPDHPLFLLLNAKGTVT 
STATCVQLHKRAERVAAALMEKGRLSVGDHVALVYPPGVDLIAA 
FYGCLYCGCVPVTVRPPHPQNLGTTLPTVKMIVEVSKSACVLTT 
QAVTRLLRSKEAAAAVDIRTWPTILDTDDIPKKKIASVFRPPSP 
DVLAYLDFSVSTTGILAGVKMSHAATSALCRSIKLQCELYPSRQ
IAICLDPYCGLGFALWCLCSVYSGHQSVLVPPLELESNVSLWLS 
AVSQYKARVTFCCYSVMEMCTKGLGAQTGVLRMKGVNLSCVRTC
MWAEERP\RIALTQSFSKLFKDLGLPARAVSTTFGCRVNVAIC 
LQGTAGPDPTTVYVDMRALRHDRVRLVKKGSPHSLPLMESGKIL 
PGVKVI IAHTETKGPLGDSHLGE I WVC5 ^ PHNATOY YTVVT:*? J7&T. 
HADHFSARLSFGDTQTIWARTGYLGFLRRTELTDASGGRHDALY
WGSLDETLELRGMRYHPIDIETSVIRAHRSIAECAVFTWTNLL 
WWELDGLEQDALDLVALVTNWLEBHYLWGVWIVDPGVIP
INSRGEKQRMHLRDGFLADQLDPIYVAYNM 


6807 


1444 


606 


VGHDTVHAMFTCFPKCLGFSPPVNVTVSPRSEESHTTTVSGGNG 
SVFQAGPQLQALANLEARRGSIGAALSSRDVSGLPVYAQSGEPR 
RLTQAQVAAFPGENALEHSSDQDTWDSLRSPGFCSPLSSGGGAE 
SLPPGGPGHAEAGHLGKVCDFHLNHQQPSPTSVLPTEVAAPPLE
KILSVDSVAVDCAYRTVPKPGPQPGPHGSLLTEGCLRSLSGDLN

RFPCGMEVHSGQRELESWAVGEAMA\LKFPMGAMSYCLRDRSR 
PTiPRT.PMnr.^ no T.n\Trs 


6808 


2063 


737 


GVGSGAASALARSRPLASRLSSRRRTRAPRSGAMQRIAMDLRML 
SRELSLYLEHQVRVGFFGSGVGLSLILGFSVAYAFYYLSSIAKK 
PQLVTGGESFSRFLQDHCPWTETYYPTVWCWEGRGQTLLRPF\ 
ITSKPPVQYRNELIKTADGGQISLDWFDNDNSTCYMDASTRPTI 
LLLPGLTGTSKESYILHMIHLSEELGYRCWFNNRGVAGENLLT 
PRTYCCANTEDLETVIHHVHSLYPSAPFLAAGVSMGGMLL.I1NYL 
LjiMutJivi JrjjFiAAAJ. r 0 VQjWNI r ACSESLEKPLNWLLFNYYLTTC 
LQSSVNKHRHMFVKQVDMDHVMKAKSIREFDKRFTSVMFGYQTI 
DDYYTDASPSPRLKSVGIPVLCLNSVDDVFSPSHAIPIETAKQN
PNVALVLTSYGGHIGFLEGIWPRQSTYMDRVFKQFVQAMVEHGH
ELS 


6809 


939 


65 


DYSGQTPVPTEHGMTLYTPAQTHPEQPGSEASTQPIAGTQTVPQ 
TDEAAQTDSQPLHPSDPTEKQQPKRLHVSNIPFRFRDPDLRQMF
uyroMbUvliJ, l r WtiKuo Ktjr v»r VTF ETSSDADRAREKLNGTIV 
EGRKIEVNNATARVMTNKKTGNPYTNGWKLNPWGAVYGPEFYA 
VTGFPYPTTGTAVAYRGAHLRGRGRAVYNTFRAAPPPPPIPTYG
AWYQDGFYGAEI\LEATQPTDTLSPLQRRQPTATVTAESTQLP
TRTITPSGPRRPTALEPCETFHRFLLGP


6810 


939 


65


DYSGQTPVPTEHGMTLYTPAQTHPEQPGSEASTQPIAGTQTVPQ 
TDEAAQTDSQPLHPSDPTEKQQPKRLHVSNIPFRFRDPDLRQMF 
«V r vivAuu'visjLXf JNiiKUgKuryrV i r alooUAi^KAKCtKIjNGTlV 
EGRKIEVNNATARWTNKKTGNPYTNGWKLNPWGAVYGPEFYA 
VTGFPYPTTGTAVAYRGAHLRGRGRAVYNTFRAAPPPPPIPTYG 
AWYQDGFYGAEI\LEATQPTDTLSPLQRRQPTATVTAESTQLP
TRTITPSGPRRPTALEPCETFHRFLLGP 


6811 


1522 


658 


DLVTVWSFVDCRVIASTHGH\KSWVSWAFDPYTTSVEEGDPME 
FSGSDEDFQDLLHFGRDRADSTQCRLSRRNSTDSRPVSVTYRFG 
SVGQDTQLCLWDLTEDILFPHQPLSRARTHTNVMNATSPPAGSN 
GNSVTTPGNSVPPPLPRSNSLPHSAVSNAGSKSSVMDGAIASGV 
SKFATLSLHDRKERHHEKDHKRNHSMGHISSKSSDKLNLVTKTK
TDPAKTLGTPLCPRMEDVPLLEPLICKKIAHERLTVLIFLEDCI 
VTACQEGFICTWGRPGKWSFNP 


6812 


4001 


1<?&2 


EDAVFSLDLSTIIQGTWFLNGEELKSNEPEGQVEPGALRYRIEQ
KGLQHRLILHAVKHQDSGALVGFSCPGVQDSAALTIQESPVHIL
SPQDKVSLTFTTSERWLTCELSRVDFPATWYKDGQKVEESELL 
WKMDGRKHRLILPEAKVQDSGEFECRTEGVSAFFGVTVQDPPV 
HIVDPREHVFVHAITSECVMLACEV\DR\EDAPVRWYKDGQEVE 
ESDFWLENEGPHRRLVLPATQPSDGGEFQCVAGDECAYFTVTI 
TDVSSWIVYPSGKVYVAAVRLERWLTCELCRPWAEVRWTKDGE 
EWESPALLLQKEDTVRRLVLPAVQLEDSGEYLCEIDDESASFT 
VTVTEPPVRIIYPRDEVTIilAVTLECWLMCELSREDAPVRWYK 
DGLEVEESEALVLERDGPRCRLVLPAAQPEDGGEFVCDAGDDSA 
FFTVTVTEPPVQFLALETTPSPLCVAPGEP\A^LSCELSRAGAPV 
VWSHNGRPVQEGEGLELHAEGPRRVLCIQAAGPAHAGLYTCQSG 
AAPGAPSLSFTVQVAEPPVRWAPEAAQTRVRSTPGGDLELVVH 

GEYLCDAPQDSRIFLVSVEEPLLVKLVSDLTPLTVHEGDDATFR 
CEVSPPDADVTWLRNGAWTPGPQRQSCCSYGGCRMCGQRKART
CVSKWRQAEWVQRGPCAGCEVGSPCPTTLACPWPRMGTSTASSS 
MVSYWPTRAPTAARATTIAPWPGSA 


6813 


9 


836 


SSTQQRPGVPAGPRPLDGYLGVADHKPLKMHCRDCALVTSSGHL 

TiH^ROG^OTnnTFHVTRNTNinZXPTPnVf^PnVPlS.TPTGT DVTaUCGT 

QRILRNPJ^DLL^7VSQGTVFIFWGPSSYMRRDGKGQVYNNLHLLS 
QVLPRIjKAFMITRHKMLQFDELFKQETGQXNRKISNTWLSTGWF 

TMTT AT.FTiCTVR TMVYfiMnPPnT?r*PnDKrHDC\rpvvivv'irDC , riDnr , n 
in i ir\uc«i-iv»i^i\ jlw v i vjrlvj cue \,t\Uirc*t\iro V tr I fix X firr (JirLJ£it» 

TMYLSHERGRKGSHHRFITEKRVFKNWARTFNIHFFQPDWKPES 
LAINHPENKPVF 


6814 


3 


737 


KFRRQEAN/ARERNRMHGLNDALDNLRKWPCYSKTQKT..SKIET
LRLAKNYIWAhSEILRIGKRPDLLTFVQNLCKGLSQPTTNLVAG
CLQLNARSFLMGQGGEAAHHTRSPYSTFYPPYHSPELTTPPGHG 
TLDNSKSMKPYNYCSAYESFYESTSPECASPQFEGPLSPPPINY 

Mf3T FQT.lfOPPT'T r>V^TTMVMVr , MtJVr , At7'DT3t3r'"DT P^PnuoDT nTn 
lnox r o i_ii\.v^iLci i jjijz VjrJMN x IN X tirlrl X v tr c JrJjtjy vjAM r KLtcr 1 JJ 

SHFPYDLHLRSQSLTMQDELNAVFHN 


6815 


906 


553 


QGLDPASQTKWELLKDGSGRRGDRRSSRDMAGGAGPRSESDLE 
DVGPTAEWNGDGSGSLRRSGSFGKLRDALRRSSEMLVKKLQGGT 
PQEPPNPRMKRASSLNFLNKSVEEPTQPGG 


6816 


1 


803 


NLtKTHKF\LLGQDEDSLHSVPVAQMGNYOEYLKTLASPLREID 
PDQPKRLHTFGNPFKQDKKGMMIDEADEFVAGPQNKVKRPGEPN 
SPMSSKRRRSMSLXjLRKPQTPPTVTNHVGGKGPPSASWFPSYPN 
LIKPTLVHTDATIIHDGHEEKMENGQITPDGFLSKSAPSELINM
TGDLMPPNQVDSLSDDFTSLSKDGLIQKPGSNAFVGGAKNCSLS 
VDDQKDPVASTLGAMPNTLQITPAMAQGINADIKHQLMKEVRKF
GRSK 


6817 


172 


3457 


LGMMDSPKIGNGLPVIGPGTDIGISSLHMVGYLGKNFDSAKVPS 
DEYCPACKEKGKLKALKTYRISFQESIFLCEDLQCIYPLGSKSL 
NNLISPDLEECHTPHKPQKRKSLESSYKDSLLLANSKKTRNYIA 
IDGGKVLNSKHNGEVYDETSSNLPDSSGQQNPIRTADSLERNEI
LEADTVDMATTKDPATVDVSGTGRPSPQNEGCTSKLEMPLESKC
TSFPQALCVQWKNAYALCWLDCILSALVHSEELKNTVTGLCSKE 
ESIFWRLLTKYNQANTLLYTSQLSGVKDGDCKKLTSEIFAEIET 
CLNEVRDEIFISLQPQLRCTLGDMESPVFAFPLLLKLETHIEKL 
FLYSFSWDFECSQCGHQYQInTIHMKSLVTFTNVIPEWHPLNAAHF 
GPCNNCNSKSQIRKMVLEKVSPIFMLHFVEGLPQNDLQHYAFHF
EGCLYQITSVIQYRANNHFITWILDADGSWLECDDLKGPCSERH 
KKFEVPASEIHIVIWERKISQVTDKEAACLPLKKTNDQHALSNE 
KPVSLTSCSVGDAASAETASVTHPKDISVAPRTLSQDTAVTHGD 
HLLSGPKGLVDNILPLTLEETIQKTASVSQLNSEAFL\LENKPV 
AENTGILKTNTLLSQESLMASSVSAPCNEKLIQDQFVDISFPSQ 
WNTNMQSVQLNTEDTVNTKSVNNTDATGLIQGVKSVEIEKDAQ 
LKQFLTPKTEQLKPERVTSQVSNLKKKETTADSQTTTSKSLQNQ 
SLKENQKKPFVGSWVKGLISRGASFMPLCVSAHNRNTITDLQPS 
VKGVNNFGGFKTKGINQKASHVSKKARKSASKPPPISKPPAGPP 
SSNGTAAHPHAHAASEVLEKSGSTSCGAQLNHSSYGNGISSANH 
EDLVEGQIHKLRLKLRKKLKAEKKKLAALMSSPQSRTVRSENLE 
QVPQDGSPNDCESIEDLLLELPYPIDIANESACTTVPGVSLYSS
QTHEEILAELLSPTPVSTELSENGEGDFRYLGMGDSHIPPPVPS 
EFNDVSQNTHLRQDHNYCSPTKKNPCEVQPDSLTNNACVRTLNL
ESPMKTDIFDEFFSSSALNAIiANDTLDLPHFDEYLFENY 


6818 


2 


240 


RGFDKVLWT/LSGAVK\CVQFSRISPDGEEGYPGELKVWVTYTL 
DGGE/LHS/ATTEHKP/VQATPVNLT\TILTSTWQARLPQI


6819 


1 


961 


GIPCTEMGNFDNANVTGEIEFAIHYCFKTHSLEICIKACKNLAY 
GEEKKKKCNPYVKTYLLPDRSSQGKRKTGVQRNTVDPTFQETLK 
YQVAPAQLVTRQLQVS VWHLGTLARRVFLGFVT T dt aTwnppne 

TTQSFRWHPLRAKADKYEDSVPQSNGELTVRAKLVLPSRPRKLQ 
EAQEGTDQPSLHGQLCLVVLGAKNLPVRPDGTLNSFVKGCLTLP 
DQQKLRLKSPVLRKQACPQWKHSFVFSGVTPAQLRQSSLELTVW 
DQALFGMNDRLLGGT\RLGSKGDTAVGGDACSQSKLQWQKVLSS 
PNLWTDMTLVLH


6820


1014 


340 


GDMVYIVGHVPPGFFEKTQNKAWFREGFNEKYLKWRKHHRVIA 
GQFFGHHHTDSFRMLYDDAGVPISAMFITPGVTPWKTTLPGWN
GANNPAIRVFEYDRATLSLKDMVTYFMNLSQANAQGTPRWELEY
QLTEAYGVPDASAHSMHTVLDRIAGDQSTLQRYYVYNSVSYSAG 
VCDEACSMQHVCAMRQVDIDAYTTCLYASGTTPVPQLPLLLMAL 
LGLCT 


6821 


1088 


518 


EFDIYR/EVGGEFVPVTRDDSSNGPPRTQHGPSPTVHPIQSPQW
RFCVLTLDPETLPAIATTLIDVLFYSHSTPKEAASSSPEPSSIT 
FFAFSLIEGYI\SIVMDAETQKKFPSDLLLTSSSGELWRMVRIG 
GQPLGFDECGIVAQIAGPLAAADISAYYISTFNFDHALVPEDGI
GSVIEVLQRRQEGLAS 


6822 


1088 


518 


EFDIYR/EVGGEFVPVTRDDSSNGFPRTQHGPSPTVHPIQSPQW 
RFCVLTLDPETLPAIATTLIDVLFYSHSTPKEAASSSPEPSSIT 
FFAFSLIEGYI\SIVMDAETQKKFPSDLLLTSSSGELWRMVRIG
GQPLGFDECGIVAQIAGPLAAADISAYYISTFNFDHALVPEDGI
GSVIEVLQRRQEGLAS


6823 


654 


221 


PPKLLSRWARMGHGDBIV\LSDLNFPGLLHLPWGPWRSVQTAC 
GIPQLLEAVLKLLPLDTYVESPAAVMELVPSDKERGLQTPVWTE 
YESILRRAGCVRALAKIERFEFYERAKKAFAWATGETALYGNL 
ILRKGVLALNPLL 


6824 


858 


104 


LLLAQRWGWG\CCFFSLAVSVKMNVLLFAPGLLFLLLTQFGFRG 
ALPKLGICAGLQWLGLPFLLENPSGYLSRSFDLGRQFLFHWTV
NWRFLPEALFLHRAFHLALLTAHLTLLLLFALCRWHRTGESILS
LLRD P S KR KVP PO P LT PNO I VSTL FTS Nf P TG T C P q T? Q T H VCi x? v\; 
WYFHTLPYLLWAMPARWLTHLLRLLVLGLIELSWNTYPSTSCSS
AALHICHAVILLQLWLGPQPFPKSTQHSKKAH 


6825 


3 


1173 


SSGEFGLQASDIMWTISDTGWILIILCSLMEPWALGACTFVHLL 
PKFDPLVILKTLSSYPIKSMMGAPIVYRMLLQQDLSSYKFPHLQ 
NCLAGGESLLPETLENWRAQTGLDIREFYGQTETGLTCMVSKTM
KIKPGYMGTAASCYDVQIIDDKGNVLPPGTEGDIGIRVKPIRPI 
GIFSGYVDNPDKTAANIRGDFWLLGDRGIKDEDGYFQFMGRADD
IINSSGYRIGPSEVENALMEHPAWETAVISSPDPVRGEWKAF 
VILALQFLSHDPEQLTKELQQHVKSVTAPYKYPRKIEFVLNLPK 
TVTGKIQRA\KLRDKEWKMSGKAPCAVRHLRDIHLDSPLLSLSF 
PFGPLALPMDGYGDSLWEEHEYKFCLALVISTKLYHVRC 


6826 


2304 


954 


LKTESFKPW/VNIALAFHLLGERASPNSFWQPYIQTLPREYDTP
LYFEEDEVRYLQSTQAIHDVFSQYKNTARQYAYFYKVIQTHPHA 
NKLPLKDSFTYEDYRWAVSSVMTRQNQIPTEDGSRVTLALIPLW 
DMCNHTNGLITTGYNLEDDRCECVALQDFRAGEQIYIFYGTRSN 
AEF^IHSGFFFD^SHDRVKIKLGVSKSDRLYAMKAEVLARAGI 
PTSSVFALHFTEPPISAQLLAFLRVFCMTEEELKEHLLGDSAID 
RIFTLGNSEFPVSWDNEVKLWTFLEDRASLLLKTYKTTIEEDKS 
VLKNHDLSVRAKMAIKLRLGEKEILEKAVKSAAVNREYYRQQME 
EKAPLPKYEESNLGLLESSVGDSRLPLVLRNLEEEAGVQDALNI 
REAISKAKATENGLVNGENSIPNGTRSENESLNQESKRAVEDAK
GSSSDSTAGVKE 


6827 


1 


779 


SSWEFGLSVLGGLFLLFVLENMLGLLRHRGLRPRCCRRKRRNL 
ETRNLDPENGSGMALQPLQAAPEPGAQGQREKNSQHPPALAPPG
HQGHSHGHQGGTDITWMVLLGDGLHNLTDGLAIGAAFSDGFSSG 
LSTTLAVFCHELPHELGDFAMLLQSGLSFRRLLLLSLVSGALG1. 
GGAVLGVGLSLGPVPLTPWVFGVTAGVFLYVALVDMLPALFPSS
GAPAYA\HVLLQGLGLLLGGCLMLAITLLEERLLPVTTEG


6828 


3 


1654 


KSQHG/WILQLMHSCKEGYVKDLKGNPGLHRAMLDLDNGTRPSE

LGHLSQTASLKRGSSFQSGRDDTWRYKTPHRVAFVEKLTKLVLS 

QLPNFWKLWISYVNGSLFSETAEKSGQIERSKNVRQRQNDFKKM

IQEVMHSLVKIiTRGALLPLSIRDGEAKQYGGWEVKCELSGQWLA 

HAIQTVRLTHESLTALEIPNDLLQTIQDLILDLRVRCVMATLQH 

TAEEIKRLAEKEDWIVDNEGLTSLPCQFEQCIVCSLQSLKGVLE

CKPGEASVFQQPKTQEEVCQLSINIMQVFIYCLEQLSTKPDADI 

DTTHLSVDVSSPDLFGSIHEDFSLTSEQRLLIVLSNCCYLERHT 

FLNIAEHFEKHNFQGIEKJTQVSMASLKELDQRLFENYIELKAD 

PIVGSLEPGIYAGYFDWKDCLPPTGVRNYLKEALVNIIAVHAEV 

FTISKELVPRVLSKVIEAVSEELSRLMQCVSSFSKNGALQARLE

ICALRDTVAVYLTPESKSSFKQALEALPQLSSGADKKLLEELLN 

KFKSSMHLQLTCFQAASSTMMKT


6829 


1 


782 


MRMEAGEAAPPAGAGGRAAGGWGKWVRLNVGGTVFLTTRQTLCR
EQKSFLSRLCQGEELQSDRDETGAYLIDRDPTYFGPILNFLRHG 
KLVLDKDMAEEGVLEEAEFYNIGPLIRIIKDRMEEKDYTVTQVP 
PKHVYRVLQCQEEELTQMVSTMSDGWRFEQLVNIGSSYNYGSED 
QAEFLCWSKELHSTPNGLSSESSRKTKSTEEQLEEQQQQEEEV 
EEVEVEQVQVEADAQEK/CCYKPEAPGCEAPDHLQGLGVPI


6830


1 


939 


MEPGSVENLSIVYRSRDFLWNKHWDVRIDSKAWRETLTLQKQL

RYRFPELADPDTCYGFRFCHQLDFSTSGALCVALNKAAAGSAYR 

CFKERRVTKAYLALLRGHIQESRVTISHAIGRNSTEGRAHTMCI 

EGSQGCENPKPSLTDLWLEHGLYAGDPVSKVLLKPLTGRTHQL

RV\HCSALGHPWGDLTYGEVSGREDRPFRMMLHAFYLRIPTDT 

ECVEVCTPDPFLPSLDACWSPHTLLQSLDQLVQALRATPDPDPE 

DRGPRPGSPSALLPGPGRPPPPPTKPPETEAQRGPCLQWLSEWT 

LEPDS 


6831 


3 


1087 


SLFFGSSTPDNKVAEQEDLETQPSPSVEKAVTVIDPEGTIPTNF 
NVAEKPADHSLSEVKLKTADEPRGTLVKSGDGQNVKEKSMILSN 
VEDLQQPKFISEVSREDYGKKEISGDSEEMNINSVVTSADGENL 
EIQSYSLIGEKLVMEEAKTIVPPHVTDSKRVQKPAIAPPSKWNI 
SIFKEEPRSDQKQKSLLSFDWDKVPQQPKSASSNFASKNITKE
SEKPESIILPVEESKGSLIDFSEDRLKKEMQNPTSLKISEEETK 
LRSVSPTEKKDNLENR\SYTL\AEKKVLAEKQNSV\APLELRDS 
NEIGKTQITLGSRSTELKESKADAMPQHFYQNEDYNERPKIIVG
SEKEKDEKKKK 


6832 


1809 


412 


MGSGLISGPPQDNSGEALKEPERAQEHSLPNFAGGQHFFEYLLV 
VSLKKKRSEDDYEPIITYQFPKRENLLRGQQEEEERLLKAIPLF
ruuL*rjY4i\Z}Lj iti fKii i rbr VIj INVUGSRKIGYCRRLLPAGPG 
PRLPKVYCIISCIGCFGLFSKILDEVEKRHQISMAVIYPFMQGL 
REAAFPAPGKTVTLKSFIPDSGTEFISLTRPLDSHLEHVDFSSL 
LHCLSFEQILQIFASAVLERKIIFLAEGLSTLSQCIHAAAALLY 
PFSWAHTYIPWPESLLATVCCPTPFMVGVQMRFQQEVMDSPME 
EVLLWLCEGTFLMSVGDEKDILPPKLQDDILDSLGQGINELKT 
AEQINEHVSGPFVQFFVKIVGHYASYIKREANGQGHFQERSFCK
ALTSKTNRRFVKKFVKTQLFSLFIQEAEKSKNPPAGYFQQKILE
YEEQKKQ/TETKGKNCEIRAWNKND 


6833 


1 


1129 


PLMTLSQCGGIPGHGHSHGGHGHGHGLPKGPRVKSTRPGSSDIN 
VAPGEQGPDQEETNTLVANTSNSNGLKLDPADPENPRSGDTVEV 
QVNGNLVREPDHMELEEDRAGQLNMRGVFLHVLGDALGSVIVW
NALVFYFSWKGCSEGDFCVNPCFPDPCKAFVEIINSTHASVYEA 
GPCWVLYLDPTLCWMVCILLYTTYPLLKESALILLQTVPKQID 
IRNLIKELRNVEGVEEVHELHVWQLAGSRIIATAHIKCEDPTSY

CALKQCCGTLPQAPSGKDAEKTPAVSISCLELSNNLEKKPRRTK 
AENIPA\WIEIKN\IPNK\QPESSL 


6834 


78 


1151 


AGQERPAPIWRLLWLPTPSVSRKAEPAHIPINR*GA*E*RGGLP 
LCGSSASAYGWH*RLTPWSPGGS*HM*SSKAPVTQAREVLVAGP
CSKLVLSGARGIVGTTVQVLVEAQQPLLLLFTGVWGLNLRAGEE
SRAL*LIEEVTQVRDAHLGNAWGCAQCLSQGQVGSAIAKALLE

rtftrtftviw^wiviij. v 0\jUFi.\j\£i\a Vb VKJj VKJJV^v&cirtVjt^VfcilHjQ 

AHGRPGLALAKGRGGTNEVEEQVQVDGVQKLVLSAHECHELVAG 
QQDGEDQAARTRLLQAGAHSVAHGRRQGQAPCRPHQEAGVSCHE
LQQWGDAL*ARE*APQIIVLLLLEDVAQLRTGKKA*DLWDVE
QLLRQL 


6835 


1 


834 


GIPAADR\EASLELIKLDISRTFPNLCIFQQGGPYHDMLHSILG
AYTCYRPDVGYVQGMSFIAAVLILNLDTADAFIAFSNLLNKPCQ
riftr r KvurHjiiMiji itMr bvr r bfcNIjPKXjrAHrKKNNIjTPDIYL 
IDWIFTLYSKSLPLDLACRIWDVFCRDGEBFLFRTALGILKLFE 
DILTKMDFIHMAQFLTRLPEDLPAEELFASIATIQMQSRNKKWA 
QVLTALQKDSREMREGKSVPPTLRLQREFALGTNQSPMPRPLCC
FRLTPGQPRRTDAL 


6836 


1 


850 


MSCGRPPPDVDGMITLKV\DNLTYRTSPDSLRRVFEKYGRVGDV 
YIPREP.HTKAPRGFAFVRFHDRRDAQDAEAAMDGAELDGRELRV 
QVARYGRRDLPRSRQGRRHAAGPEAA/RYGRRSRSYGRRSRSPR 
RRHRSRSRGPSCSRSRSRSRYRGSRYSRSPYSRSPYSRSRYSRS 
PYSRSRYRESRYGGSHYSSSGYSNSRYSRYHSSRSHSKSGSSTS 
SRSASTSKSSSARRSKSSSVSRSRSRSRSSSMTRSPPRVSKRKS 

"K GP GTTRPDV QDTTTTWrtAMCG 
i\J fvO rvO r^J\)r C IVO fcr LLiIiiLvjyrloO 


6837 


1 


1369 


TDGAAVAGNPGSDYFPGGTAP/GGPRTRRP\SGTSSSGSKASGP 
PNPPAQGDGTSLSPNYTLESTSGNDGKPVSGGGGRGRGRRKRDS 
GHVSPGTFFDKYSAAPDSGGAPGVSPGQQQASGAAVGGSSAGET
RGAPTPHEKA'LTflP^wn'K'nAP'T.T.T.nnnDnT.TPCT nrrfiwonacQ 
PNVGEFASDEVSTSYANEDEVSSSSDNPQALVKASRSPLVTGSP
KLPPRGVGAGEHGPKAPPPALGLGIMSNSTSTPDSYGGGGGPGH 
PGTPGLEQVRTPTSSSGAPPPDEIHPLEILQAQIQLQRQQFSIS 
EDQPLGLKGGKKGECAVGASGAQNGDSELGSCCSEAVKSAMSTI 
DLDSLMAEHSAAWYMPADKALVDSADDDKTLAPWEKAKPQNPNS
KEAHDLPANKASASQPGSHLQCLSVHCTDDVGDAKARASVPTWR 
SLHSDISNRFGTFVAALT 


6838 


16 


499 


LTDTPPPKTHMIHHSISDYKATLRCWALGFYPMEITLTWQQDEE 
DQTRDMELVETRPAGDGTFQKWAAVWPSGEE/Q/RYMCHVQHE 
Lib f u. f Jb l ijKW EQ&SQPTIPI VG I VAGL VLIX-^VVTGAVVSAVMC 
RKKNSDRVSYSEAASSDHAQGSDVSLTACKV 


6839 


1 


1195 


AAPAGGGPDPEALSAFPGRHLSGLSWPQVKRLDALLSEPIPIHG
RGNFPTLSVQPRQIRAGGPQHPGGAG\IHVHRVRLHGSAASHVL 
HPESGLGYKDLDLVFRMDLRSEASFQIjTKAWLACLLDFLPAGV
SRAKITPLTLKEAYVQKLVKVCTDSDRWSLISLSNKSGKNVELK 
FVDSVRRQFEFSIDSFQIILDSLLLFGQCSSTPMSEAFHPTVTG 
ESLYGDFTEALEHLRHRVIATRSPEEIRGGGLLKYCHLLVRGFR
PRPSTDVRALQRYMCSRFFIDFPDLVEQRRTLERYLEAHFGGAD 
AAPJ^YACLVTLHRVVNESTVCLMNHERRQTLDL IAALALQALAE 
QGPAATAALAWRPPGTDGWPATVNYYVTPVQPLLAHAYPTWLP 
CN 


6840 


4254 


2061 


ELQGDFSVPDVPKSMAWCENSICVGFKRDYYLIRVDGKGSIKEL 
FPTGKQLEPLVAPLADGKVAVGQDDLTVVLNEEGICTQKCALNW 
TDIPVAMEHQPPYIIAVLPRYVEIRTFEPRLLVQSIELQRPRFI

TSGGSNIIYVASNHFVWRLIPVPMATQIQQLLQDKQFELALQLA 
EMKDDSDSEKQQQIHHIKNLYAFNLFCQKRFDESMQVFAKLGTD
PTHVMGLYPDLLPTDYRKOLOYPNPT.PVI.SRAFr.PWawT at Tnv 
LTQKRSQLVKKLNDSDHQSSTSPLMEGTPTIKSKKKLLQIIDTT 
LLKCYLHTNVALVAPLLRLENNHCHIEESEHVLKKAHKYSELII
LYEKKGLHEKALQVLVDQSKKANSPLKGHERTVQYLQHLGTENL
HLIFSYSVWVLRDFPEDGLKIFTEDLPEVESLPRDRVLGFLIEN
FKGLAIPYLEHIIHVVJEETGSRFHNCLIQLYCEKVQGLMKEYLL
SFPAGKTPVPAGEEBGELGEYRQKLLMFLEISSYYDPGRLICDF 
PFDGLLEERALLLGRMGKHEQALFIYVHILKDTRMAEEYCHKHY
DRNKDGNKDVYLSLLRMYLSPPSIHCLGPIKLELLEPKANLQAA 
LQVLELHHSKLDTTKALNLLPANTQINDIRIFLEKVLEENAQKK
RFNQVLKNLLHAEFLRV\QEERILHQQVKCIITEEKVCMVCKKK 
IGNSAFARYPNGWVHYFCS\KEVNPADT 


6841 


1 


3206 


TPSTl'GTKSNTPTSSVPSAAVTPLNESLQPLGDYGVGSKNSKRA 
REKRDSRNMEVQVTQEMRNVSIGMGSSDEWSDVQDIIDSTPELD 
MCPETRLDRTGSSPTQGIVNKAFGINTDSLYHELSTAGSEVIGD 
VDEGADLLGEFSGMGKEVGNLLLENSQLLBTKNALNWKNDLIA 
KVDQLSGEQEVLRGELEAAKQAKVKLENRIKELEEELKRVKSEA 
IIARREPKEEAEDVSSYLCTESDKIPMAQRRRFTRVEMARVLME
RNQYKERLMELQEAVRWTEMIRASREHPSVQEKKKSTIWQFFSR 
LFSSSSSPPPAKRPYPSGNIHYKSPTTAGFSQRRNHAMCPISAG 
SRPLEFFPDDDCTSSARREQKREQYRQVREHVRNDDGRLQACGW 
SLPAKYKQLSPNGGQEDTRMKNVPVPVYCRPLVEKDPTMKLWCA
AGVNLSGWRPNEDDAGNGVKPAPGRDPLTCDREGDGEPKSAHTS 
PEKKKAKELPEMDATSSRVWILTSTLTTSKWIIDANQPGTWD 
QFTVCNAHVLCISSIPAASDSDYPPGEMFLDSDVNPEDPGADGV 
uau 1 1 Lt VL»Q,A a RCNVPRSNCSSRGDTPVXiDKGQGE VATIANGKV 
NPSQSTEEATEATEVPDPGPSEPETATLRPGPLTEHVFTDPAPT 
PSSGPQPGSENGPEPDSSSTRPEPEPSGDPTGAGSSAAPTMWLG 
AQNGWLYVHSAVANWKKCLHSIKLKDSVLSLVHVKGRVLVALAD
GTLAIFHRGEDGQWDLSNYHLMDLGHPHHSIRCMAWYDRVWCG 
YKNKVHVIQPKTMQIEKSFDAKPRRESQVRQLAWIGDGVWVSIR
LDSTLRLYHAHTHQHLQDVDIEPYVSKMLGTGKLGFSFVRITAL

LVAGSRLWVGTGMGWT c ?TPT.T'RT\A7T.U»r2ri\ nr\T d amvtc n 

TSGEG\ARPGG\ I IHVYG\DDSSDRAARS fi P YCSMAQAQLCFH 
GHRDAVKFFVSVPGNVLATLNGSVLDSPAEGPGPAAPASEVEGQ 
KLRNVLVLSGGEGYIDFRIGDGEDDETEEGAGDMSQVKPVLSKA 
ERSHIIVWQVSYTPE


6842 


3 


926 


RCQQLSATILTDHQYLERTPLCAILKQKAPQQYRIRAKLRSYKP
RRLFQSVKLHCPKCHLLQEVPHEGDLDIIFQDGATKTPDVKLQN 

LSEICKLSNKFNSVIPVRSGHEDLELLDLSAPFLIQGTVHHYGC 
KQWST*RSIQNLNSLVDKTSWIPSSVAEALGIVPLQYVFVMTFT 
LDDGTGVLEAYLMDSDKFFQIPASEVLMDDDLQKSVDMIMDMFC 
PPGIKIDAYPWLECFIKSYOTTNGTDNQICYQIFDTTVAEDVI 


6843 


2 


851 


NHRKVLSGAKRYECNECGKSFAYTSSLIKHRRIHTGERPYECSE 
CGRSFAENSSLIKHLRVHTGERPYECVECX3KSFRRSSSLLQHQR 
VHTRERPYECSECGKSFSLRSNLIHHQRVHTGERHECGQCGKSF 
SRKSSLIIHLRVHTGERPYECSDCGKSFAENSSLIKHLRVHTGE
RPYECIDCGKSFRHSSSFRRHQRVHTGMRPYK*SKFWKFSCPGF 
LLLQGQRVHTGSRCYECDKWGIFFS*NASFFT+KSAPTEEVPFE 
CNECEKAFSPLSLVTTIFT 


6844 


244 


642 


EHQLAGFELRKTQTSMSLGTTREKTDRVKSTAYLSPQELEDVFY
QYDVKSEIYSPGIVLWEIATGDIPFQGCNSEKIRKLVAVKRQQE
PLGEDCPSELREIIDECRAHDPSVRPSVDEILKKLSTFSK*CIK
I 


6845 


3 


1519 


VAVRDECYWRHVFWDQDLWMLLFILMCKPETARARLEYRIRTLD 
GALENAQNLGYQGAKFAWESADSGLEVCPEDIYGVQEVHVNGAV
GLAFELYYHTTQDLQLFREAGGWDWRAVAEFWCSRVEWSPREE
KYHLRGVMSPDEYHSGVNNSVYTNVLVQNSLRFAAALAQDLGLP 
IPSQWLAVADKIKVPFDVEQNFHPEFDGYEPGEWKQADWLLG 
YPVPFSLSPDVRRKNLEIYEAVTSPQGPAMTWSMFAVGWMELKD 
AVRARGLLDRSFANMAEPFKVWTENADGSGAVNFLTGMGGFLQA 
WFGCTGFRVTRAGVTFDPVCLSGISRVSVSGIFYQGNKLNFSF
SEDSVTVEVTARAGPWAPHLEAELWPSQSRLSLLPGHKVSFPRS 
AGRIQMSPPKLPGSSSSEFPGRTFSDVRDPLQSPLWVTLGSSSP 
TESLTVDPASE*SGTGASETSLGPSLWPRLHPPLLGTbliACHPS
PAARLSGKVHAAWPEFKAFCL


6846 


213 


1258 


LYFLKTIK*LNRLAEHP*YENEKLTKLRNTIMEQYTRTEESARG
IIFTKTRQSAYALSQWITENEKFAEVGVKAHHLIGAGHSSEFKP
MTQNEQKEVISKFRTGKINLLIATTVAEEGLDIKECNIVIRYGL 
VTNEIAMVQARGRARADESTYVLVAHSGSGVIEHETVNDFREKM 
MYKAIHCVQNMKPEEYAHKILELQMQSIMBKKMKTKRNIAKHYK 
NNPSLITFLCKNCSVTiArRGRnTTTVTRTflWH\7TJM r PDPt?V'T?T vtu 

RENKTLQKKCADYQINGEIICKCGQAWGTMMVHKGLDLPCLKIR
NFVWFKNNSTKKQYKKWVELPITFPNLDYSECCLFSDED 


6847 


1450 


348 


SMCWNSDRLEMPLIDLALILYPPSYVPYTGHLSDDSLSRKYCLT
WFEDALNGVL*RAEAIQPHCVNAGDRMEKFRQKYWNKLQTLRQQ 
PFAYGTLTVRSLLDTREHCLNEFNFPDPYSKVKQRENGVALRCF 
PGVVRSLDALGWEEROLALVTCrST.T.aRMVirriwnavnirG&TrT pcnc 

YFGFEEAKRKLQERPWLVDSYSEWLQRLKGPPHKCALIFADNSG 
IDIILGVFPFVRELLLRGTEVILACNSGPALNDVTHSESLIVAE
RIAGMDPWHSALRJSERLLLVQTGSSSPCLDLSRLDKGLAALVR 
ERGADLWIEGMGRAVHTNYHAALRCESLKLAVIKNAWLAERLG 
GRLFSVIFKYEVPAE


6848 


19 


16 


AMWWNSLDGIRNIVLSNPKKROTLSIJ^LKSLQSDILHDADSND
LKVIIISAEGPVFSSGHDLKELTEEQGRDYHAEVFQTCSKVMMH
IRNHPVPVIAMVNGLATAAGCQLVASCDIAVASDKSSFATPGVN
VGLFCSTPGVALARAVPRKVAIiEMLFTGEPISAQEALLHGLLNK 
WPEAELQEETMRIARKIASLSRPWSLGKATFYKQIjPQDLGTA 
YYLTSQAMVDNLALRDGQEGITAFLQKRKPVWSHEPV*VEH


6849 


70 


821 


SLGVDGSCLEQGSPAPRPQTDTSP*PVGNWATQQEDLYHQSYEC

VCVLFASVPDFKEFYSESNINHEGLECLRLLNEIIADFDELIjSK 

PKFSGVEKIKTIGSTYMAATGLNATSGQDAQQDAERSCSHLGTM 

VEFAVAIiGSKLDVINKHSFNNFRLRVGLNHGPVVAGVIGAQKPQ 

YDIWGNTVNVASI^ESTGVLGTfTnVTRRTAWaT.nQT/3V , rr , ve'Dr» 

VIKVKGKGQLCTYFLNTDLTRTGPPSATLG


6850 


2 


1235 


ARGLNHEWTFEKLRQHISRNAQDKQELHLFMLSGVPDAVFDLTD
LDVLKLELIPEAKIPAKISQMTNLQELHLCHCPAKVEQTAFSFL
RDHLRCLHVKFTDVAEIPAWVYLLKNLRELYLIGNLNSENNKMI 
GLESLRELRHLKILHVKSNLTKVPSNITDVAPHLTKLVIHNDGT 
KLLVLNSLKKMMNVAELELQNCELERIPHAIFSLSNLQELDLKS
NNIRTIEEIISFQHLKRLTCLKLWHNKIVTIPPSITHVKNLESL 
YFSNNKLESLPVAVFSLQKLRCLDVSYNNISMIPIEIGLLQNLQ
HLHITGNKVDILPKQLFKCIKLRTLNLGQNCITSLPEKVGQLSQ
LTQLELKGNCLDRLPAQLGQCRMLKKSGLWEDHLFDTLPLEVK
EALNQDINIPFANGI 


6851 


1765 


660 


VSAQVSAREGENCLGWNLADSSQESYKSLEEAEDCYPPSLLTLD 
LRDLFNQVEQGPLLSCPKAGTDLSMGRAREVGWMAAGLMIGAGA 
CYCVYKLTIGRDDSEKLEEEGEEEWDDDQELDEEEPDIWFDFET 
MARPWTEDGDWTEPGAPGGTEDRPSGGGKANRAHPIKQRPFPYE 
HKNTWSAQNCKNGSCVLDLSKCLFIQGKLLFAEPKDAGFPFSQD 
INSHLASLSMARNTSPTPDPTVREALCAPDNLNAS IESQGQI KM 
Y 1 NE VCR ET VS RCCNS FLQQ AG LNLL I S MTV I NNMLAKS ASDLK 
FPLISEGSGCAKVQVLKPLMGLSEKPVLAGELVGAQMLFSFMSL 
FIRNGNREILLETPAP


6852 


1 


407 


RTRGEETYANFIKHNDGKNIFYAARTPATLFAVMFAMYIISGLT
G FI GLNS I AVLCNLVMGLALI FLCTWAYVKYSGE FRE I GTVIDQ 
I AE TLWE Q VLKP LGDNLM E EN I RQS VTNS I KAG LTDQ VS HHARL 
KTD 


6853


3 


469 


GDSCAVCIELYKPNDLVRILTCNHIFHKTCVDPWLLEHRTCPMC 
KCDILKALGIEVDVEDGSVSLQVPVSNEIFNSASSHEEDNRSET 
AS SG YAS VQGT YEPPLEEHVQSTNE S LQLVNHEANS VAVD VI PH 
VDNPTFEEDETPNQETAVREIKS 


6854 


1148 


585 


HESYIGTFDPGELCVCAAIQWLQDNSASYFLNRKLVYEPSTQAK
PVKNTFLRMWI YSHHI YQQDLRKKI LDVGKRLDVTGFCMTGKPG 
IICVEGFKEHCEEFWHTIRYPNWKHISCKKAESVETEGNGEDLR 
LFHSFEELLLEAHGDYGLRNDYHMNLGQFLEFLKKHKSEHVFQI 
LFGIESKSSDS 


6855


1913 


1148 


GRVGGRVGRICSPLSGANEYIASTDTLKTEEVLLFTDQTDDLAK 
EEPTSLFQRDSETKGESGLVLEGDKEIHQIFEDLDKKLALASRF 
Y I P E G CI QRWAAE MWALDALHRE G I VCRDLNPNNILLNDRGHI 
QLTYFSRWSEVEDSCDSDAIERMYCAPEVGAITEETEACDWWSL 
GAVL FELLTG KTLVECHPAG INTHTTLNMPE WVS EEARS L I QQL 
LQFN P LERLG AGVAG VED IKSHPFFTP VDW AE LMR 


6856 


1617 


997


VTQLYVSVDASTKDSLKKIDRPLFKDFWQQFLDSLKALAVKQQR
TVYRLTLVKAWNVDELQAYAQLVSLGNPDFI E VKGVTYCGES SA 
SSLTMAHVPWHEEWQFVRELVDLIPEYEIACEHEHSNCLLIAH 
RKFKIGGEWWTWINYNRFQELIQEYEDSGGSKTFSAKDYMARTP 
HWAL FGAS ERGFDPKDTRHQRKNKS KAI SGC 


6857 


1 


617 


KGPEATAMVCVCSHPNCRQNrilKPSHSAAQTt^CGSPTPASAPNH
KLMAMEQGKTLPSATEDAKEEGLEAQISRLAELIGRLESKALWF 
DLQQRLSDEDGTNMHLQLVRQEMAVCPEQLSEFLDSLRQYLRGT 
TG VRNCFH I TAVRLSDG FTF VI YEF WE TEEAWKRHLQS PL C JCAF 
RHVKVDTLSQPEALSRILVPAAWCTVGRD 


6858 


2 


669 


RSRGIKDFENDPPLSSCGIFQSRIAGDALLJDSGIRISSVFASPA
LRCVQTAKLILEELKLEKKIKIRVEPGIFEWTKWEAGKTTPTLM 
SLEELKEANFNIDTDYRPAFPLSALMPAESYQEYMDRCTASMVQ 
I VNTC PQDTG V I L I VSHGS TLDS CTRP LLGLP PRE CGDFAQL VR 
KIPS LGMCFCEENKEEGKWELVNPPVKTLTHGANAAFNWRNW I S 
GN 


6859 


1 


1150 


GETMFKKAKTKAKKKPRKRSDSSGGYNLSDI TO<? P<? <?Tr5T.T.X'Qf3 ' 
KTNSVESLPELLTSDSEGSYAGVGSPRDLQSPDFTTGFHSDKIE 
AKVKP YVNGTS P VYS REDLKP WEKS P I LKISAPQP I PSNRI DTT 
SSAS WVAGS FS PVS PPWDLRTIME I EESRQKCGATPKSHLGKT 
VSHGVKLSQKQRKMIALTTKENNSGMNSMETVLFTPSKAPKPVN 
AW AS S LHS VS S KS FRDFLLE E KKS VTS H S SGDHVKKVS FKG I EN 
SQAPKIVRCSTHGTPGPEGNHISDLPLLDSPNPWLSSSVTAPSM 
VAPVTFASIVEEELQQEAALIRSREKPLALIQIEEHAIQDLLVF 
YEAFGN PEE F V I VERTPQGPLA VPM WNKHGC 


6860 


1889 


1515 


DKDKKRQKKRGIFPKVATNIMRAWLFQHLTHPYPSEEQKKQLAQ
DTGLTILQVNNWFINARRIIVQPMIDQSNRAVSQGAAYSPEGQP
MGSFVLDGQQHMGIRPAGPMSGMGMNMGMDGQWHYM


6861 


1889 


1515 


DKDKKRQKKRGIFPKVATNIMRAWLFQHLTHPYPSEEQKKQLAQ
DTGLTILQVNNWFINARRIIVQPMIDQSNRAVSQGAAYSPEGQP
MGSFVLDGQQHMGIRPAGPMSGMGMNMGMDGQWHYM


6862 


2 


471 


EEIDREFHNKLKLKEDKLEKQEKPVNGEDKGDSGVDTQNSEGNA
DEEDPLGPNCYYDKTKSFFDNISCDDNRERRPTWAEERRLNAET 
FG I PLRPNRGRGG YRGRGGLGFRGGRGRGGGRGGTFTAPRGFRG 
GFRGGRGGREFADFEYRKTTAFGP 


6863 


2216 


487 


PQ E P ALKS E FS Q VASNT I P LP L PQ PNTC KDNG P CKQ VC S TVGGS 
AICSCFPGYAIMADGVSCEDQDECLMGAHDCSRRQFCVNTLGSF 
YCVNHTVLCADGYILNAHRKCVDINECVTDLHTCSRGEHCVNTL 
GSFHCYKALTCEPGYALKDGECEDVDECAMGTHTCQPGFLCQNT 
KGSFYCQARQRCMDGFLQDPEGNCVDINECTSLSEPCRPGFSCI 
NTVGSYTCQRNPLICARGYHASDDGTKCVDVNECETGVHRCGEG 
QVCHNLPGSYRCDCKAGFQRDAFGRGCIDVNECWASPGRLCQHT 
CENTliGSYRCSCASGFLLAADGKRCEDVNECEAQRCSQECANlY 
GSYQCYCRQGYQLAEDGHTCTDIDECAQGAGILCTFRCLNVPGS 
YQCACPEQGYTMTANGRSCKDVDECALGTHNCSEAETCHNIQGS 
FRCLRFECPPNYVQVSKTKCERTTCHDFXECQNSPARITHYQLN 
FQ TG LL V PAH I FR IG P APAFTGDT 1 ALN 1 1 KGNE EG Y FGT RRLN 
AYTGWYLQRAVLEPRDFALDVEMKLWRQGS VTTFLAKMH I FFT 
TFAL 


6864 


2 


2933 


LADSSPSNLQIIIKELLSMHHQPDPALTKEFDYLPPVDSRSSSG

FVGLRNGGATCYMNAVFQQLYMQPGLPESLLSVDDDTDNPDDSV 

FYQVQSL.FGHLMESKLQYYVPENFWKIFKMWNKELYVREQQDAY 

EFFTSLIDQMDEYLKKMGRDQIFKNTFQGIYSDQKICKDCPHRY 

ERE EAFMALNLGVTS CQSLEISLDQ FVRGEVLEGS NAY YCE KCK 

EKRITVKRTCI KSLPSVLVIHLMRFGFDWESGRS IKYDEQIRFP 

WMLNMEPYTVSGMARQDSSSEVGENGRSVDQGGGGSPRKKVALT 

ENYELVGVIVHSGQAHAGHYYSFIKDRRGCX3KGKWYKFNDTVIE 

EFDLNDETLEYECFGGEYRPKVYDQTNPYTDVRRRYWNAYMLFY 

QRVSDQNSPVLPKKSRVSWRQEAEDLSLSAPSSPEISPQSSPR 

PHRPNNDRLSILTKLVKKGEKKGLFVEKMPARIYQMVRDENLKF 

MKNRDVYSSDYFS F VLSLASLNATKLKHP YYPCMAKVSLQLAIQ 

FLFQTYLRTKKKLRVDTEEWIATIEALLSKSFDACQWLVEYFIS 

SEGRELIKIFLLECNVREVRVAVATILEKTLDSALFYQDKLKSL 

HQLLEVLLALLDKDVPENCKNCAQYFFLFNTFVQKQGIRAGDLL 

LRHSALRHM I S FLLGASRQNNQ IRRWS S AQARE FGNLHNTVALL 

VLHS DVS SQRNVAPG I FKQRP PIS IAP S S PLLPLHEE VEALLFM 

S EGKP YLLE VM FALRE LTGS LLAL I EMWYCCFCNEHFS FTMLH 

FIKNQLETAPPHELKNTFQLLHEILVIEDPIQVERVKFVFETEN 

GLLALMHHSNHVDSSRCYQCVKFLVTLAQKCPAAKEYFKENSHH 

WSWAVQWLQKKMSEHYWTLQSNVSNETSTGKTFQRTISAQDTLA 

YATALLNE KEQ S GS SNGS ES S PANENGDRHLQQGS E S PMM I G E L 

RSDLDDVDP 


6865 


1820 


1242 


DPERWKHLSKVTPPGSSVSTTPVQWRLQSPQSQGSMMPSCNRS
CSCSRGPSVEDGKWYGVRSYLHLFYEGYAVPPKLEGIGEGEFLV 
LDQRAADYNQALGTCRLAGTALCVAAGVLLAI CLFWAM IGWLSQ 
DTKAEPLDPEADSHVEVFGDE PEQQLS P I FRNASGQSWFSPPAS 
PFGQSSVQTIQPKRDS 


6866 


1571 


495 


DCPRPRYTLYGLRATCMRbLDWAWINAVSAFKALEQDLPVNIKF 
I I EGME EAGS VALE EL VE KEKDRFFSGVDYIVIS DNL W I S QRKP 
AI T YGTRGNS YFM VE VKCRDQDFHS GTFGG I LHEPMADLVALLG 
S hVDSSGHI LVPGI YDEWPLTEEE INTYKAIHLDLEE YRNSSR 
VE KFLFDTKE E I LMHLWR YPS LS I HGI EGAFDE PGTKTV I PGRV 
IGKFSIRLVPHMNVSAVEKQVTRHLEDVFSKRNSSNKMWSMTL 
GLHPWIANIDDTQYLAAKRAIRTVFGTEPDMIRDGSTIPIAKMF 
QEIVHKSWL I PLGAVDDGEHSQNE KINRWNY I EGTKLFAAFFL 
EMAQLH 


6867 


2833 


1704 


GTR I M S Q P KQ KELAG F VRO KMLLD YS VYMG RC VPO F 9 R q po P a. p 
LQ S AE S S P TAG KKLPE VP P S EE EE Q EAWVNALLGR I FW D FLGE K 
YWSDLVSKKIQMKLSKIKLPYFMNELTLTELDMGVAVPKIIiQAF 
KPYVDHQGLWIDLEMSYNGSFLMTLETKMNLTKLGKEPLVEALK 
VGEIGKEGCRPRAFCLADSDEESSSAGSSEEDDAPEPSGGDKQL 
LPGAEGYVGGHRTSKIMRFVDKITKSKYFQKATETEFIKKKIEE
VSNTPLLLTVEVQECRGTLAVNI P P P PTDRVWYGFRKP PHVELK 
ARPKLGEREVTLVHVTDWIEKKLEQEFQKVFVMPNMDDVYITIM 
HSAMDPRSTSCLLKDPPVEAADQP 


6868 


1 


346 


RPTRPPTRPEEIKNLILPYISDMNFVQDLCEDFYELFKTDKGFD 
KATFESQMS VMRGQILNLTQALRDGKS PFQLVQ I PCVI VERSQG 
GSQGRIVHLSNSFTQTVNCRKPFFSSW


6869 


3 


1619 


MYMERMDKRALISFWESVEHLKNANKNEIPQLVGEIYQNFFVES 
KEISVEKSLYKErOOrT.VfiNKnTFVFVFf tot« nvvwTT tmo woo 

FIVSDLYEKLLIKEEEKHASQMISNKDEMGPRDEAGEEAVDDGT 

NQINEQASFAVNKLRELNEKbEYKRQALNSIQNAPKPDKKIVSK 

LKDEIILIEKERTDLQLHMARTDWWCENLGMWKASITSGEVTEE 

NGEQL P CYFVMVS LQE VGG VETKNWTVP KRLS E FHNLHRKLS EC 

VPSLKKDQLPSLSKLPFKSIDHTFMEKFENQLNKFLQNLLSDER 

LCQSEALYAFLSPSPDYLKVIDVQGKKNSFSLSSFLERLPRDFF 

SHQEEETEEDSDLSDYGDDVDGRKDALAEPCFMLIGEIFELRGM

FKWVRRTLIALVQVTFGRTINKQIRDTVSWIFSEQMLVYYINIF 

RDAFWPNGKLAPPTTIRSKEQSQETKQRAQQKLLENIPDMLQSL 

VGQQNARHGIIKIFNALQETRANKHLLYALMELLLIELCPELRV 

HLDQLKAGQV 


6870 


1 


1566 


piMMv v/\>a.i KwwyiiijijV ^xjoAA^MQjAiCjAr , QPPNIJjl>LLMDDMGWG 
D LG V YGE P S RE T PNLDRMAAEGLLF PNFY SAN P LC S P SRAAL LT 
GRLP I RNG F YTTNAHARNAYT PQE I VGG I PDSEQLLPELLKKAG 
YVSKIVGKWHLGHRPQFHPLKHGFDEWFGSPNCHFGPYDNKARP 
NI P VYRDWEMVGR YYEEFP INLKTGEANLTQI YLQEALDF I KRQ 
ARHHPFFLYWAVDATHAPVYAS KPFLGTSQRGR YGDAVRE I DDS 
IGKILELLQDLHVADNTFVFFTSDNGAALISAPEQGGSNGPFLC 
G KQTTFEGGMRE PAIAWW PGHVTAGOVSHOLGS I MDT ,ptt ct at 

AGLTPPSDRAIDGLNLLPTLLQGRLMDRPIFYYRGDTLMAATLG 
QHKAH FW T WTNS W ENFRQG I D FCPGQNVS G VTTHNLEDH TKL PL 
IFHLGRDPGERFPLSFASAEYQEALSRITSWQQHQEALVPAQP 
QLNVCNWAVMN WAP PG CEKLG KCLT P P E S I P KKCLW S H 


6871 


209 


1126 


RMSLNPPIFLKRSEENSSKFVETKQSQTTSIASEDPLQNLCIiAS 
QEVLQKAQQSGRSKCLKCGGSRMFYCYTCYVPVENVPIEQIPLV 
KLPLKIDI IKHPNETDGKSTAIHAKLI*APEFVNI YTYPCI PEYE 
EKDHEVALIFPGPQSISIKDISFHLQKRIQNNVRGKNDDPDKPS 
FKRKRTEEQEFCDLNDSKCKGTTLKKIIFIDSTWNQTNKIFTDE 
RLQGLLQVELKTRKTCFWRHQKGKPDTFLSTIEAIYYFLVDYHT 
D I LKEKYRGQYDNLLFF YS FMYQLIKNAKCSGDKETGKLTH 


6872 


880 


459 


FG LLMVVL S L I FMKGNC VRE DL I FN FL F KLGLD VRETNGL FGNT 
KKLITEVFVRQKYLEYRRIPYTEPAEYEFLWGPRAFLETSKMLV 
LRFLAKLHKKDPQSWPFHYLEALAECEWEDTDEDEPDTGDSAHG 
PTSRPPPR 


6873 


1929 


955 


DEQAVLCSKDKTYDLKIADTSNMLLFIPGCKTPDQLKKEDSHCN 
I IHTE I FG FSNNYWELRRRRPKLKKLKKLLMENPYEGPDSQKEK 
D S NS S K YTT E DL LDQ I QAS E E E I MTQI »Q VI iNACK I GG Y WR I LE F 
DYEMKLLNHVTQLVDSESWSFGKVPLNTCLQELGPLEPEEMIEH 
CLKCYGKKYVDEGEVYFELDADKICRAAARMLLQNAVKFNLAEF 
QEVWQQSVPEGMVTSLDQLKGLALVDRHSRPEIIFLLKVDDLPE 
DNQERFNSLFSLREKWTEEDIAPYIQDLCGEKQTIGALLTKYSH 
SSMQNGVKVYNSRRPIS 


6874 


1 


307 


DS I ADHVNS AAVNVE EGTKNLGKAAKYKLAALP VAGAL I GGM VG 
GPIGLLAGFKVAGIAAALGGGVLGFTGGKLIQRKKQKMMEKLTS 
SCPDLPSQTDKKCS


6875


1688 


349 


VIGTGERGNSASEKWEIMFNEELGDPFIIIHSISLLNAEEHSIA 
TLLLRIEKEEbDMKGSGFYVSLEWVTISKKNQDNKKYEIIKRDI 
LRGKS VPH YAAI E PDGNGLM I VS YKS LTFVQAGQDLEENMDED I 
SEKIKEPLYYWQQTEDDLTVTIRLPEDNTKEDIQIQFLPDHINI 
VLKDHQFLEGKL YSS IDHESSTWI I KESNSLE I SLI KKNEGLTW 
PELVIGDKQGELIRDSAQCAAIAERLMHLTSEELNPNPDKEKPP 
CNAQELEECDIFFEESSSLCRFDGNTLKTTHVVNLGSNQYLFSV 
IVDPKEMPCFCLRHDVDALLWQPHSSKQDDMWEHIATFNALGYV 
QASKRDKKFFACAPNYSYAALCECLRRVFIYRQPAPMSTVL.YNR 
KEGROVGOVAKOOVASLETNnPT"LOPnATNRRT.P\7T.TTKTJT.irr T 
KVNTEN 


6876 


41 


1285 


VGEMTLIWRHLLRPLCLVTSAPRILEMHPFLSLGTSRTSVTKLS 
LHTKPRMPPCDFMPERYQVIFLVNSGSEANELAMLMARAHSNNI 
DIIS FRGAYHGCS P YTLGLTNVG I YKMELPGGTGCQPTMCPD VF 
RGPWGGSHCRDS PVQTI RKCSCAPDCCQAKDQ Y I EQFKDTLSTS 

EVQTGFGRLGSHFWGFQTHDVLPDIVTMAKGIGNGFPMAAVITT 
PEIAKSLAKCLQHFNTFGGNPMACAIGSAVLEVIKEENLQENSQ 
EVGTYMLLKFAKLRDEFEIVGDVRGKGLMIGIEMVQDKISCRPL 
PREEVNQIHEDCKHMGIiLVGRGSIFSQTFRIAPSMCITKPEVDF 
AVEVFRSALTQHMERRAK 


6877 


1 


778 


QAKLFTVPSEALAIAVATEWDSQQDTIKYYTMHLTTLCNTSLDN 
PTQRNKDQLIRAAVKFLDTDTICYRVEEPETLVELQRNEWDPII 
EWAEKRYGVEISSSTSIMGPSIPAKTREVLVSHLASYNTWALQG 
IEFVAAQLKSMVLTLiGLlDLRLTVEQAVLLSRLEEEYQIQKWGN 
IEWAHDYELQELRAPJTAAGTLFIHLCSESTTVKHKLLKE 


6878 


931 


263 


QTLQGDFKNRAEMIDFNIRXKNVTRSDAGKYRCEVSAPSEQGQN 
LEEDTVTI iPVT >VAP AVP Q r"P\/D Q Q a t . Q r!T\A7PT p nnn wnu did 

EYTWFKDGIRLLENPRLGSQSTNSSYTMNTKTGTLQFNTVSKLD 
TGEYSCEARNSVGYRRCPGKRMQVDDIjNISGI IAAWWALVIS 
VCGLGVC YAQRKGYFSKETS FQKSNSS SKATTMS ENDFKHTKSF 
II 


6879 


3 


845 


IRVIGESDIMQEFLSESDENYNGVSDVELRVALPDGTTVTVRVK 
KNSTTDOVYOAIAAKVGMD^TTUNYPAr.PWT <zu QWRVT.a pwp 

FPHKLYIQNYTSAVPGTCLTIRKWLFTTEEEILLNDNDLAVTYF 
FHQAVDD VKKGY I KAEE KS YQLQKLYEQRKMVM YLNMLRTCEG Y 
NEIIFPHCACDSRRKGHVTTAT^TTHPKTiHAPTPPRnT.PMnVTA 

FEWDEMQRWDTDEEGMAFCFEYARGEKKPRWVKIFTPYFNYMHE 
CFERVFCELKWRKEEY 


6880 


2110 


1437 


RKDN CTAKE W T F P E AKWNTT AR VFS H I RLGMGH VL 1 1 VQ C F I S S 
MAN I YNEKI LKEGNQLTE S I F IQNS KL YFFG I LFNGLTLGLQRS 
NRDQ I KNCGFFYGHRAFS VAL IFVTAFQGLSVAF I LKFLDNM FH 
VLMAQ VTT V 1 1 TTVS VLVFD FRP S L E F F LEAP S VLLS I F I YNAS 
KPQVPEYAPRQERIRDLSGNLWERSSGDGEELERLTKPKSDESD 
EDTF 


6881 


2638 


2244 


N DS K WE D I HV I TGAL KM P FRE L P E P L FT FNH FND F VN A I KQ E P R 
QRVAAVKDLIRQLPKPNQDTMQILFRHLRRVIENGEKNRMTYQS 
IAIVFGPTLLKPEKETGNIAVHTVYQNQIVELILLELSSIFGR 


6882 


1 


850 


GIPEAQLWIYPVKSCKGVPVSEAECTAMGLRSGNLRDRFWLVIN 
QEGNMVTARQBPRLVLISLTCDGDTLTLSAAYTKDLLLPIKTPT 
TNAVHKCRVHGLE I EGRDCGEATAQWI TS FLKSQP YRLVHFEPH 
MRPRRPHQIADLFRPKDQIAYSDTSPFLILSEASLADLNSRLEK 
KVKATNFRPNIVISGCDVYAEDSWDELLIGDVELKRVMACSRCI 
LTTVDPDTGVMSRKBPLETLKSYRQCDPSERKLYGKSPLFGQYF 
VLENPGTIKVGDPVYLLGQ 


6883 


2794 


2256 


NSKLKLNQNLKLFITLTYQVLSLHGl^GPGIHLQKEGAFPVtQNR 
ALQLLYDLR YLNI VLTAKGDEVKSGRSKPDSR I EKVTDHLEAL I 
DPFDLDVFTPHLNSNLHRLVQRTSVLFGLVTGTENQLAPRSSTF 

NSQEPHNILPLASSQIRFGLLPLSMTSTRKAKSTRNIETKAQYD 
ANC 


6884 


2 


99 


EFERVTAEAVKPRETSEPRAAAQRFCEKFPFL 


6885 


297 


1554 


STGQFWHVTDLHLDPTYHI TDDHTKVCAS SKGANASNPG P FGDV 
LCDS PYQLILSAFDFIKNSGQEASFMIWTGDSPPHVPVPELSTD 
TVINVITNMTTTIQSLFPNLQVFPALGNHDYWPQDQLSWTSKV 
YNAVANLWKPWLDEEAISTLRKGGFYSQKVTTNPNLRIISLNTN 
LYYGPNIMTLNKTDPANQFEWLESTLNNSQQNKEKVYIIAHVPV 
GYLPSSQNITAMREYYNEKLIDIFQKYSDVIAGQFYGHTHRDSI 
MVLS DKKGS P VNS LFVAPA VT P VKS VLEKQTNNPG I RL FQYDPR 

DYKLLDMLQYYLNLTEANLKGESIWKLEYILTQTYDIEDLQPES 
LYGIjAKQFT ILDS KQF I KY YNYFFVS YDSS VTCDKTCKAFQ I CA 
IMNLDNISYADCLKQLYI KHNY 


6886


2 


1341 


QCGGIPGREGGSSRPLEEGTGSSPACVRGAAPGSEDAFYPTRAK 
QARVSQELKKAAJCRTVSISEGPDTLGDGMRERRETLALAPEPEP 
LEKEACEKWKRPFRSASATSLTLSHCVDWKGLLDFFCKRRGHSI 
GGAPEQRYQI I PVCVAARL PTRAQD VLDAHLS EVNAVR FGPNS S 
LLATGGADRLIHLWNWGSRLEANQTLEGAGGSITSVDFDPSGY 
QVLAAT YNQAAQLW KVGEAQS KETLSGHKDKVTAAKFKLTRHQA 
VTGSRDRTVKEWDLGRAYCSRTINVLSYCNDWCGDHIIISGHN 
DQKIRFWDSRGPHCTQVIPVQGRVTSLSLSHDQLHLIiSCSRDNT 
LKVI DLR VSNI RQVFRADGFKCGSDWTKAVFS PDRS YALAG S CD 
GAL Y I WDVDTGKL E S RLQG PHCAAVNAVAWCY S GSHMVS VD QG R 
KWLWQ 


6887 


1047


116 


WTARPSQKPFWEAGAVPGDPLSTGCSQAQLGdCCPRGPWGPQHG 
GQQRAAGPTLPRGERGGPQQSGPGLAAQTPPTSKQVAWRAFLTG 
TYRSQSPRSPAGPFRGGTGWWPEPAVCLCVAVGPQRLSSPGLVY 
NASG S EHC YD I YRL YH S CAD P TG CGTG PDARAWD YQACTE I NLT 
FASNNVTDMFPDLPFTDELRQRYCLDTWGVWPRPDWLLTSFWGG 
DLRAASNIIFSNGNLDPWAGGGIRRNLSASVIAVTIQGGAHHLD 
LRAS H P ED P AS WE ARKLEAT 1 1 GEWVKAARREQQ PALRGG PRL 
SL 


6888 


1 


992 


FVAYVKKE I PHI WTHCLLNPHALVI KTLPTKLRDALFTWRVI 
NFIKGRAPNHRLFQAFFEEIGIEYSVLLFHTEMRWLSRGQILTH 
IFEMYEEINQFLHHKSSNLVDGFENKEFKIHLAYLADLFKHLNE 
LSASMQRTGMNTVSAREKLSAFVRKFPFWQKRIEKRNFTNFPFL 
EEIIVSDNEGIFIAAEITLHLQQLSNFFHGYFSIGDLNEASKWI 
LDPFLFNIDFVDDSYLMKNDLAELRASGQILMEFETMKLEDFWC 
AQFTAFPNLAKTALE I LMPFATTYLCELGFS I TFTFQNKVPEAA 
LILSDDIRVAISKKVPSFLGHH 


6889 


1 


1534 


LTLENQIKEEREQDNSES PNG RTS PL VS QNNEQGS T LRD LLTTT 
AG KLR VGS TDAG I A FAP VYSMGAP S S KS GR TMPNILDD 1 1 AS W 
ENKI PPS KTSKINVKPELKEEPEES I ISAVDENNKLYSDI PHSW 
I CE KH I L WLKD YKNS SN WKLFKEC WKQGQPAWSGVHKKMMI S L 
WKAESISLDFGDHQADLLNCKDSIISNANVKEFWDGFEEVSKRQ 
KNKSGETWLKLKDWPSGEDFKTMMPARYEDLLKSLPLPEYCNP 
EGKFNLASHLPGFFVRPDLGPRLCSAYGWAAKDHDIGTTNLHI 
EVSDWNILVYVG1AKGNGILSKAGILKKFEEEDLDDILRKRLK 
DS S E I PGALWH I YAGKD VDKI REFLQ K I S KEQGLE VL P EHD P IR 
DQSWYVNKKLRQRLLEEYGVRTWTLIQFLGDAIVLPAGALHQVQ 
NFHSCIQVTEDFVSPEHLVESFHLTQELRLLKEEINYDDKLQVK
NIL YHA VKEM VRAL K I HEDE VDDMEEN 


6890 


3 


667 


THACGMWIPLYLHRALWHKTAETCNSPPCGAKDSLIFGAITCF 
TG FLGVDTGAG ATRWCRLKTQRADPLVCAVGMLGSAI FICLIFV 
AAKSSIVGAYICIFVGETLLFSNWAITADILMYVVIPTRRATAV 
ALQ S FTS HLLG DAGS PYLIGFISDLIRQSTKDS PLW E FL S LG YA 
LML C P FWVLGG MF FIiATALF F VSD RARAEQ Q VNQLAM P P AS VK 
V 


6891 


1980 


1262 


LRIHQELLSKELKLLRGITIESIIHIGLAAGKEQFMQDASNVMQ

LLLKTQSHLYNMEDNNPEVRQAAAYGLGVMAQFGGDDYRSLCSE 

AVPLLVKVIKRAHSKTKKNVIATENCISAIGKILKFKPNCVNVD 

EVLPHWLSWLPLHEDKEEAIQTLSFLCDLIESNHPVVTGPNNSN 

LPKIISIIAEGKINETINYEDPOUCRLANVVRQVQTSEDLWLEC 

VSQLDDEQQEALQELLNFA 


6892 


3 


876 


RSVAAASGPGAWGTDHYCLELLRKRDYEGYLCSLLLPAESRSSV

FALRAFNVELAQVKDSVSEKTIGLMRMQFWKKTVEDIYCDNPPH 

QPVAIELWKAVKRHNLTKRWLMKIVDEREKNLDDKAYRNIKELE 

NYAENTQSSLLYLTLEILGIKDLHADHAASHIGKAQGIVTCLRA 

TPYHGSRRKVFLPMDICMLHGVSQEDFLRRNQDKNVRDVIYDIA 

SQAHLHLKHARSFHKTVPVKAFPAFLQTVSLEDFLKKIQRVDFD 

IFHPSLQQKNTLLPLYLYIQSWRKTY 


6893 


1 


842 


DGERKSMSVERTFSEINKAEEQYSLCQELCSEIAQDLQKERLKG 
RTVT I KLKNVN FEVKTRAS T VSS WS TAEE I FAI AKELLKTE ID 
AD F P H P LRLRLMG VR I S S FPNE EDR KHQQRS 1 1 G F LQAGNQALS 
ATECTLEKTDKDKFVKPLEMSHKKSFFDKKRSERKWSHQDTFKC 
E AVN KQ S FQTS Q P FQ VLKKKMNENL E I S ENS DDCQ I LTC P VC FR 

AQGC1SLEALNKHVDECLDGPSISENFKMFSCSHVSATKVNKKE 
NVPASSLCEKQDYEAH


6894 


1742 


1463 


TTLCKPLVPREHQFYETLPAEMRKFTPQYKGKSQLLEGLPHWRG

DVRDRGHGRPWQPSLEPSLPPTLCFPSLSSFSSSWPSAQHLTPS 
VFNPW 


6895 


2379 


478 


VTYVELCDLASPTALLIMRTVLDLIVEDLQSTSEDKEQQYTSQT 
TRLLALLYALASHKACKLAILHLINGTIKGDERYAEIFQDLLAL 
VRSPGDSVIRQQCVEYVTSILQSLCDQDIALILPSS3EGSISEL 
EQLSNSLPNKELMTSICDCLLATLANSESSYNCLLTCVRTMMFL 
AEHD YG L FHLKS S LR KNS S ALH S LL KR WS TFS KDTGE LAS S FL 
E FMRQ I LNS DT I GCCGDDNGLME VEGAHTSRTMS INAAELKQLL 
QS KEES PENLFLELEKLVLEHS KDDDNLDS LLDS WGLKQMLES 
SGDPLPLSDQDVEPVLSAPESLQNLFNNRTAYVLADVMDDQLKS 
MWFTPFQAEEIDTDLDLVKVDLIELSEKCCSDFDLHSELERSFL 
SEPSSPGRTKTTKGFKLGKHKHETFITSSGKSEYIEPAKRAHVV 
P P PRGRG RGG FGQG I R PHD I FRQRKQNTSRP PSMHVDD FVAAES 
KEWPQDGIPPPKRPLKVSQKISSRGGFSGNRGGRGAFHSQNRF 
FTPPASKGNYSRREGTRGSSWSAQNTPRGNYNESRGGQSNFNRG 
PLPPLRPLSSTGYRPSPRDRASRGRGGLGPSWASANSGSGGSRG 
KFVSGGSGRGRHVRSFTR


6896 


1 


555 


GNI VIQKKKYNKQHI I PLENVTI DS I KDEGDLRNG WLI KTPTKS 
FAVYAATATEKSEWMNH INKC VTDLLS KSGKTPSNEHAAVWVPD 
SEATVCMRCQKAKFTPVNRRHHCRKCGFWCGPCSEKRFLLPSQ 
SSKPVRICDFCYDLLSAGDMATCQPARSDSYSQSLKSPLNDMSD 
DDDDDDSSD 


6897 


3 


920 


GDGLMHEWNGLMERPDWETAIQKPLCSLPAGSGNALAASLNHY 
AGYEQVTNEDLLTNCTLLLCRRLLSPMNLLSLHTASGLRLFSVL 
SLAWGFIADVDLESEKYRRLGEMRFTLGTFLRLAALRTYRGRLA 
YLPVGRVGSKTPASPVWQQGPVDAHLVPLEEPVPSHWTWPDE 
DFVLVLALLHSHLGSEMFAAPMGRCAAGVMHLFYVRAGVSRAML 
LRLFLAMEKGRHMEYECPYLVYVPWAFRLEPKDGKGVFAVDGE 
LMVSEAVQGQVHPNYFWMVSGCVEPPPSWKPQQMPPPEEPL 


6898 


919 


346 


QKTVTAVASLLKGRQGIYTENERRMGAVIKIRFFKIMLVLIICW 
LSNIINESLLFYLEMQTD1NGGSLKPVRTAAKTTWFIMGILNPA 
QG FLLSLAFYGWTGCS LG FQS PRKE I QWESLTTS AAEGAHPS PL 
MPHENPASGKVSQVGGQTSDEALSMLSEGSDASTIEIHTASESC 
NKNEGDPALPTHGDL


6899 


120 


827 


MKVRKNNDAYLLDKNKINMDCFISCFFKKMLTTLMFSHSGILSL 
LEHGEEYTFSLPCAYARSILTVPWVELGGKVSVNCAKTGYSASI 
T FH T K P F YGG KLH R VTAE VKHN I TNT WCR VQG E WNS VLE FTY S 
NGETKYVDLTKLAVTKKRVRPLEKQDPFESRRLWKNVTDSLRES 
EIDKATEHKHTLEERQRTEERHRTETGTPWKTKYFIKEGDGWVY 
HKPLWKIIPTTQPAE


6900 


3 


451 


TE VLG S KG I HE LR S S TS ALHHALE E S AS LLTM F WRAALP S TH1 P 
VLPGKVGESTERELLELRTKVSQQEQLLQSTTEHLKNANQQKES 
MEQFI VSQLTRTHDVLKKARTNLEVR KLLHQSEAPSLS PTHHHP 
LADLVGDSWPALRFQEK 


6901 


1 


201 


DDNM VQRLE TD FKMTLQQQ S TLEQ WAAWLDNVMMQ ALK P YEGR P 
SFPKAARQFLLKWS FYRYHLGFS 


6902


2 


267 


GAPPPPPSQPPRQPPQAAPSSHPHSDLTFNPSSALEGQAGAQGA 
SDMPEPSLDLLPELTNPDELLSYLDPPDLPSNSNDDLLSLFENN 


6903 


1 


149 


RINQVYRQGPTGIHILVIDQMVQNFQDESCFLFSTVKAESSDGI 
HULK 


6904 


464 


2092 


ME AS L P VS LS C VLACGDVEG KFD I L FNR VQA I Q KKS GN FD LLLC 
VGNFFGSTQDAEWEEYKTGIKKAPIOTYVLGANNOFTVKYPnna 

DGCELAENITYLGRKGIFTGSSGLQIVYLSGTESLNEPVPGYSF 

SPKDVSSLRMMLCTTSQFKGVDILLTSPWPKCVGNFGNSSfiFVn 

TKKCGSALVSSLATGLKPRYHFAALEKTYYERLPYRNHIILQEN 

AQHATRFIALANVGNPEKKKYLYAFSIVPMKLMDAAELVKQPPD 

VTEN P YR KS GQE AS I GKQ I LAP VE E S ACQ FFFDLNE KQGR KR S S 

TGRDSKSSPHPKQPRKPPQPPGPCWFCLASPEVEKHLWNIGTH 

CYLALAKGGLSDDHVLILPIGHYQSWELSAEWEEVEKYKATL 

RRFFKSRGKWCWFERNYKSHHLQLQVIPVPISCSTTDDIKDAF

ITQAQEQQIELLEIPEHSDIKQIAQPGAAYFYVELDTGEKLFHR 

IKKNFPLQFGREVLASEAILNVPDKSDWRQCQISKEDEETLARR 

FRKDFEPYDFTLDD 


6905 


1 


226 


VSKTGEAETITSHYLFAIiGVYRTLYLFNWIWRYHFEGFFDLIAi! 
VAGLVQ TVL YCDFF YL YI TKVL KG KKL S LPA 


6906 


3 


611 


SYDDHNGHIDFITAASNLRAKMYSIEPADRFKTKRIAGKIIPAI 
ATTTATVSGLVALEMIKVTGGYPFEAYKNWFLNLAIPIWFTET 
TE VRKTK I RNG I S FT I WDR WTVHGKEDFTLLDF INAVKEKYG I E 
PTMWQGVKMLYVPVMPGHAKRLKLTMHKLVKPTTEKKYVDLTV 
SFAPDIDGDEDLPGPPVRYYFSHDTD 


6907 


2 


2228 


LRGVPVWAAGAFRFSSGEESTSHLIMSRRSQRLTRYSQGDDDGS 
SSSGGSSVAGSQSTLFKDSPLRTLKRKSSNE4XRLSPAPQLGPSS 
DAHTSYYSESLVHESWFPPRSSLEELHGDANWGEDLRVRRRRGT 
GGSESSRASGLVGRKATEDFLGSSSGYSSEDDYVGYSDVDQQSS 
SSRLRSAVSRAGSLLWMVATSPGRLFRLLYWWAGTTWYRLTTAA 
SLLDVFVLTRRFSSLKTFLWFLLPLLLI.TCLTYGAWYFYPYGLQ 
TFHPALVSWWAAKDSRRAPEGWEARDSSPHFQAEQRVMSRVHSL 
ERRLE ALAAE FSSNWQ KEAMRLERLELRQGAPGQGGGGGLSHED 
TLALLEGL VS RRE AAL KE D FRR E TAAR I QEE LS ALRAEHQQDS E 
DLFKKIVRASQESEARIQQLKSEWQSMTQESFQESSVKELRRLE 
DQLAG LQQ ELAALALKQS S VAE E VGLL PQQ I QAVRDD VE S Q FP A 
WISQFLARGGGGRVGLLQREEMQAQLRELESKILTHVAEMQGKS 
AREAAASLSLTLQKBGVIGVTfiEQVHHIVKQALQRYSEDRIGIaA 
DYALESGGASVISTRCSETYETKTALLSLFGIPLWYHSQSPRVI 
LQPDVHPGNCWAFQGPQGFAWRLSARIRPTAVTLEHVPKALSP 
NSTISSAPKDFAT PGPDEDLOOPnTLT/SKPTVnnnrspPTnTmjT? 

QAPTMATYQWELRILTNWGHPEYTCIYRFRVHGEPAH 


6908 


3 


780 


QVPSAAWLMAVCGLGSRLGLGSRLGLQGCFGAARLLYPRFQSRG 
PQGVEDGDRPQPSSKTPRIPKIYTKTGDKGFSSTFTGERRPKDD 
QVFEAVGTTDELSSAIGFALELVTEKGHTFAEEIiQKIQCTLQDV 
GSALATPCSSAREAHLKYTTFKAGP I LELEQW IDKYTSQLPPLT 
AFILPSGGKISSALHFCRAVCRRAERRWPLVQMGETDANVAKF 
LNRLSDYLFTLARYAAMKEGNQEKIYKKNDPSAESEGL 


6909 


3 


409 


GRLLAVGTDLYGQRSSAPEQELLVQDATPVSNSLLPEKAFSDIP 
SPYLRGTIKMMOAVROAFODODnRRTWnfiRPT.TMaziTPnnr'T vn 
LCWDTI KRS SQTGE WQN I AIMTEEPELS PAYLI S EAMRRS RMS 
LYC 


6910 


1 


10*8 


LVPVWIDSYYYGKLVIAPLNIVLYNIFTPHGPDLYGTEPWYFY 
LINGFLNFNVAFALALLVLPLTSLMEYLLQRFHVQNLGHPYWLT 
LAPMYIWFIIFFIQPHKEERFLFPVYPLICLCGAVALSALQHSF 
LYFQKCYHFVFQRYRLEHYTVTSNWLALGTVFLFGLLSFSRSVA 
LFRG YHG P LD L Y P E F YR I ATD PT I HTVPEGRP VNVCVG KE W YRF 
PSSFLLPDNWQLQFIPSEFRGQLPKPFAEGPLATRIVPTDMNDQ 

NLEEPSRYIDISKPHYT.VnT.nTMPPTDPPDlTVCCMirPPLiJTCr AV 
RPFLDASRSSKLLRAPYVPP"L^rinYT\7WNrVTTT vddvi votdv 
KSGG 


6911 


1184 


966 


GEDAEEMETGNVANLISIFGSSFSGLLRKSPGGGREEEEGEESG 
PEAAE PGQ I CCDKPVLRDMNPWSTAI VAF 


6912 


1 


844 


AMKPVETHSFQMLFT I LSTGSALKAQS YEDAYRCI KSS ILLGS I 

GESGELVCTKPIPCQPTHFWNDENGNKYRKAYFSKFPGIWAHGD 
YCR I N P KTGG I VM LGRS DGTLNPNGVR FGS S E I YN I VE S FE E VE 
DSLCVPOYNKYREERVTLPL.KMARnHAPnpnr.VK'nTnnATPMriT 
S ARHVPSL I LE TKGI P YTLNGKKVEVAVKQ 1 1 AGKAVEQGGAFS 
NPETLDLYRDIPELQGF 


6913 


1643 


1558


KKSHEESHKEELSYGAQASLPLPCSDFR 


6914 


1251 


615 


ELAAECKSAGYPGTLIPYRCDLSNEEDILSMFSAIRSQHSGVDI 
CINNAGIARPDTLLSGSTSGWKDMFNVNVLALSICTREAYQSMK 
ERNVDDGHIININSMSGHRVLPLSVTHFYSATKYAVTALTEGLR 
QELREAQTHIRATCISPGWETQFAFKLHDKDPEKAAATYEQMK 
CLKPEDVAEAV I YVLST PAH I Q IGD I QMRPTEQVT 


6915 


254 


652 


GRSLSFKTFLIWVLISIYQGGILMYGALVLFESEFVHVVAISFT 
ALILTELLMVALTVRTWHWLMWAEFLSLGCYVSSLAFLNEYFD
VAFITTVTFLWKVSAITWSCLPLYVLKYLRRKLSPPSYCKLAS 


6916 


254 


652 


GRSLSFKTFLIWVLISIYQGGILMYGALVLFESEFVHVVAISFT
ALILTELLMVALTVRTWHWLMWAEFLSLGCYVSSLAFLNEYFD
VAFITTVTFLWKVSAITWSCLPLYVLKYLRRKLSPPSYCKLAS


6917 


254 


652 


GRSLSFKTFLIWVLISIYQGGILMYGALVLFESEFVHVVAISFT
ALILTELLMVALTVRTWHWLMWAEFLSLGCYVSSLAFLNEYFD
VAFITTVTFLWKVSAITWSCLPLYVLKYLRRKLSPPSYCKLAS


6918 


28 


921 


PEAGTRSWREPDPEDLRRFLLSAACRSFPQWLPGGGGGQVSSCS 
DTDVP YLLLAVKS E PGRFAERQAVRETWGSPAPGI RLLFLLGS P 
VGEAGPDLDSLVAW5SRRYSDLLLWDFLDVPFNQTLKDLLLLAW 
LGRHC PT VS F VLRAQDDAFVHT PALLAHLRALP PAS ARS L Y LG E 
VFTQAMPLRKPGGPFYVPESFFEGGYPAYASGGGYVIAGRLAPW 
LLRAAAR VAP FP FEDVYTGLCIRALGLVPQAHPGFLTAWPADRT 
ADHCAFRNLLLVRPLGPQAS I RLWKQLQDPRLQC 
SEQ ID NO:


Predicted beginning nucleotide location corresponding to first amino acid residue of amino acid sequence


Predicted end nucleotide location corresponding to first amino acid residue of amino acid sequence


Amino acid segment containing signal peptide
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)
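
By way of illustration only, the one-letter code set out in the legend above can be applied mechanically to any amino acid segment in this table. The following Python sketch is not part of the original disclosure. The names LEGEND, decode and unknown_symbols are hypothetical and chosen for this example, and the treatment of whitespace as insignificant is an assumption made so that segments broken up by spacing can still be read.

```python
# Illustrative sketch only; all names here are hypothetical.
# The mapping reproduces the legend of the table column
# "Amino acid segment containing signal peptide".
LEGEND = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid",
    "E": "Glutamic Acid", "F": "Phenylalanine", "G": "Glycine",
    "H": "Histidine", "I": "Isoleucine", "K": "Lysine",
    "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine",
    "S": "Serine", "T": "Threonine", "V": "Valine",
    "W": "Tryptophan", "Y": "Tyrosine", "X": "Unknown",
    "*": "Stop Codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def _symbols(segment):
    """Drop all whitespace (assumed to carry no meaning in a segment)."""
    return "".join(segment.split())


def decode(segment):
    """Return the legend entry for each symbol of a segment.

    Raises KeyError if the segment contains a symbol outside the legend.
    """
    return [LEGEND[symbol] for symbol in _symbols(segment)]


def unknown_symbols(segment):
    """Return, sorted, the symbols of a segment that the legend does not define."""
    return sorted({s for s in _symbols(segment) if s not in LEGEND})


if __name__ == "__main__":
    # Segment listed for SEQ ID NO: 6920 in the table.
    print(decode("EAQG PS KVHLTLKKKK")[:4])
    # A symbol outside the legend points to a transcription problem.
    print(unknown_symbols("DFDELIjSK"))
```

A segment for which unknown_symbols returns a non-empty list contains characters that the legend does not define, which may help locate transcription errors in a listing of this kind.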


6919 


850 


41 


QGRRELSGSVFCPFIQQEPKEMLTLSEYHERVRSQGQQLQQLQA

ELDKLHKEVSTVRAANSERVAKLVFQRLNEDFVRKPDYALSSVG 

ASIDLQKTSHDYADRNTAyFWNRFSFWNYARPPTVILEPHVFPG 

NCWAFEGDQGQWIQLPGRVQLSDITLQHPPPSVEHTGGANSAP 

RDFAVFFLLSFFTHQGLQVYDETEVSLGKFTFDVEKSEIQTFHL 

QNDPPAAFPKVKIQILSNWGHPRFTCLYRVRAHGVRTSEGAEGS 

AQGPH 


6920 


1418 


591 


EAQGPSKVHLTLKKKK


6921 


2 


1711 


MNATRS EE QFH V I NHAEQTLR KMEN YLKE KQLCD VLL I AGHLR I 

QYAYTGVLQLKEDTIESLLAAACLLQLTQVI DVCSNFLI KQLHP 
SNCLGIRSFGDAQGCTELLNVAHKYTMEHFIEVIKNQEFLLLPA 
NE I S KLLCSDD INVPDEETI FHALMQWVGHDVQNRQGELGMLLS 
YIRLPLLPPQLLADLETSSMFTGDLECQKLLMEAMKYHLLPERR 
S MMQ S PRT KPRKST VG AL Y A VGGMD AMKGTTT I E KYD LRTNSWL 
HTGTMNGRRLOPRVAVTDMKT.VVVnnprjnT vtt MT\7T?ni?KTDi7nv 

IWTVMPPMSTHRHGLGVATLEGPMYAVGGHDGWSYLNTVERWDP 
EGRQWNYVASMSTPRSTVGWALNNKLYAIGGRDGSSCLKSMEY 
FDPHTNKWSLCAPMSKRRGGVGVATYNGFLYWGGHDAPASNHC 
SRLSDCVERYDPKGDSWSTVAPLSVPRDAVAVCPLGDKLYWGG 
YDGHT YLNT VES YDAQRNE WKEE VP VNIGRAGACWWKL P 


6922 


1075 


369 


LTPPAGIRHEVRDREREREREREREKFPLDSTGSELKQNIHSIT

HTiPP AMnK'A/TVIVK'nT.Zi DT^nVTT D t? T V^TTdr* 7\ u'TMnnrrmTMnt rr >> 
vjjjit r rti'iyivvni i\OJ_i/-iiriliUJ\. J. i-iKt-L a. V looAKXMoijtjbTINDVLiA 

VNTPKDAAQQDAKAEENKKEPLCRQKQHRKVLDKGKPEDVMPSV 
KGAQERLPTVPLSGMYWKSGGKVRLTFKLEQDQLWIGTKERTEK 
L P MGS I KNWS E P I EGHE D YHMMAFQ LG PT EAS Y YWVY W V PTQ Y 
VDAIKDTVLGKWQYF


6923 


2469 


1660 


LGLFCILPIDTLCAVLERDTLSIRESRLFGAWRWAEAECQRQQ
LPVTFGNKQKVLGKALSLIRFPLMTIEEFAAGPAQSGILSDREV 
VNLFLHFTVNPKPRVEYIDRPRCCLRGKECCINRFQQVESRWGY 
SGTS DR I R FT VNRR I S I VG FGT > Yfi 3 i va prn vnvxr t n t t i? vt? v \e 

QTLGQNDTGFSCDGTANTFRVMFKEPIEILPNVCYTACATLKGP 
DSHYGTKGLKKWHETPAASKTVFFFFSSPGNNNGTSIEDGQIP 
EIIFYT 


6924 


2210 


1235 


PEERVICFVEYYLTAFHEGRKGALAKKPYNPIIGETFHCSWEVP 
KDRVKPKRTASRSPASCHEHPMADDPSKSYKLRFVAEQVSHHPP 
ISCFYCECEEKRLCVNTHVWTKSKFMGMSVGVSMIGEGVLRLLE 
HGEEYVFTLPSAYARS ILTI PWVELGGKVSINCAKTGYSATVI F 
HTKP F YGGKVHR VTAE VKHNPTNT I VCKAHGE WNGTLE FT YNNG 
ETKVIDTTTLPVYPKKIRPLEKQGPMESRNLWREVTRYLRLGDI 
DAATEQKRHLEEKQRVEERKRENLRTPWKPKYFIQEGDGSGILQ 
SPLESTLMGLEVQSFPV


6925 


2 


1653 


RGGAAG AAME PDSVIEDKTIELMCSVPRS LWLG CANL VE SM CAL 
SCLQSMPSVRCLQISNGTSSVIVSRKRPSEGNYQKEKDLCIKYF 
DQWS E S DQ VE FVEH LI S RM CH YQHGH I NS YLKPMLQRD F I TAL P 
EQGLDHIAF^ILSYLDARSLCAAELVCKEWQRVISEGMLWKKLI 
ERMVRTDPLWKGLSERRGWDQYLFKNRPTDGPPNSFYRSLYPKI 
IQD I ET I ESNWR CGRHNLQR IQCRS ENS KGVYCLQYDDEKI I SG 
LRDNSIKIWDKTSLECLKVLTGHTGSVLCLQYDERVIVTGSSDS 
TVRVWD VWTGEVLNTLIHHNEAVLHLR FSNGLM VTCSKDRS I AV 
W DMAS ATD I TLRR VL VGHRAAVNWD FDD KYI VS ASGDRT I KVW 
S TST CE FVRT LNG H KRG I ACLQ YRDRLWS GSSDNT I R LWD I EC 
GACLRVLEGHEELVRCIRFDNKRIVSGAYDGKIKVWDLQAALDP 
RAPASTLCLRTLVEHSGRVFRLQFDEFQIISSSHDDTILIWDFL 
NVPPSAQNETRSPSRTYTYISR


6926 


1 


733 


SGRVAMDGLGLQFPEQGFPAGPPLLPPHMGGHYRDCQSLGAPPL 
DGYPL.PTPDTSPLDGVDPDPAFFAAPMPGDCPAAGTYSYAQVSD 
YAGPPEPPAGPMHPRLGPEPAGPSIPGLLAPPSALHVYYGAMGS 
PGAGGGRGFQMQPQHQHQHQHQHHPPGPGQPTPPPEALPCRDGT 
DPSQPAELLGEVDRTEFEQYLHFVCKPEMGLPYQGHDSGVNLPD 
SHGAISSWSDASSAVYYCNYPDV


6927 


2 


1484 


LTLCGDI QLMLAONANNRAAHLE E FHYOT KFDOFTT.H<3T .hp pcq 

CQGFAWATDLSTDLESQLSVSCKCYEAANBILQFRDLKSQNPEH 
Y VQVL KRMGN I RNE I G VF YMNQAAALQS E RL VS KS VS AAEQQLW 
KKSFSCFEKG I HNFE S I EDATNAALLLCNTGRLMRI CAQAHCGA 
GDELKREFSPEEGLYYNKAIDYYLKALRSLGTRDIHPAVWDSVN 
WELSTTYFTMATLQQDYAPLSRKAQEQIEKEVSEAMMKSLKYCD 
VDS V S ARQ P L CQ YRAAT I HHRLAS M YHS CLRNQ VGDEHLR KQHR 
VLADLHYSKAAKLFQLLKDAPCELIjRVQLERVAFAEFQMTSQNS 
NVGKLKTLSGALDIMVRTEHAFQLIQKEL1EEFGQPKSGDAAAA 
ADAS PS LNREEVM KLLS I FESRLS FLLLQS I KLLS STKKKTSNN 
IEDDTILKTNKH I YSQLLRATANKTATIiLERINVI VHLLGQLAA 
GSAASSNAVQ 


6928 


1086 


777 


EAIDLINNLLQVKMRKRYSVDKTLSHPWLQDYQTWLDLRELECK
I G ER Y I THE S DDLRWE K YAGEQG LQ Y PTHL I NP S AS HSDTP E TE 
ETEMKALGERVSIL 


6929 


1749 


607 


RDQRGYRDDRSPAREPGDVSARTRSGGGGGRSATTAMPPPVPNG 
NIoHQHDPQDLRHNGNVWAGRPSCS RG PRRAI OKPO P Qrs 

RGPAAGGLCLQPPDGGTCVPEEPPVPPMDWEALEKHLAGLQFRE 
QEVRNQGQARTNSTSAQKNERESIRQKLALGSFFDDGPGIYTSC 
SKSGKPSLSSRLQSGMNLQICFVNDSGSDKDSDADDSKTETSLD 
TPLSPMSKQSSSYSDRDTTEEESESLDDMDFLTRQKKLQAEAKM 
ALAMAKPMAKMQVEVEKQNRKKSPVADLLPHMPHISECLMKRSL 
KPTDLRDMTIGQLQVIVNDLHSQIESLNEELVQLLLIRDELHTE 
QDAMLVDIEDLTRHAESQQKHMAEKMPAK 


6930 


131 


545 


FKDTANVFVSLFQMRNNFRHYFIEPSQLKLFYDVITWIVTQVAI 
S YTWPFVLLS I KPSLTFYSS WYYCLHILGILVLLLLPVKKTQR 
R KNTHEN I QLS QS KKFDEGENS LGQNS FS TTNNVCNQNQE IAS R 
HSSLKQ 


6931 


2 


659 


FVERLPNRPACLLVASGAAEGVSAQSFLHCFTMASTAFNIiQVAT 
PGGKAMEFVDVTESNARWVQDFRLKAYASPAKLES IDGARYHAL 
L I PSC PGALTDLAS SGSLAR I LQHFHS ES KPI CAVGHGVAALCC 
ATNEDRSWVFDSYSLTGPSVCELVRAPGFARLPLWEDFVKDSG 
ACFSASEPDAVHWLDRHLVTGQMASSWPAVQNLLFLCX3SRK 


6932 


2 


1131 


FVDSPGQGEQAEEEEGGIQMNSRMRAHSPAEGASVESSSPGPKK 
S DMCEGCRS LAAGH PG YISHDKETS I KYVSHQHPSH PQLFS I VR 
QACVRSLSCEVCPGREGPIFFGDEOHGFVFSHTFFTKnQT.ni?r:v 
QR W YS 1 1 T I MMDR I YL I NS W PFZj LGK VRG 1 1 DELQG KAL KVFEA 
EQFGCPQRAQRMNTAFTPFLHQRNGNAARSLTSLTSDDNLWACL 
HTSFAWLLKACGSRLTEKLLEGAPTEDTLVQMEKLADLEEESES 
WDNSEAEEEEKAPVLPESTEGRELTQGPAESSSLSGCGSWQPRK 
LPVFKSLRHMRQVGGRGTAHHELRRRANHGLCLPTRLASGPSTL 
KTLQEVTDSLLGGWLMAQGVGGI I 


6933 


1431 


890 


SLNLHCTLPPPPHQYPAGYPSDKEGKKPKGQSKKQPSGTTKRPI 
SDDDCPSASKVYKASDSAEAIEAFQLTPQQQHLIRBDCQNQKLW 
DEVLSHLVEGPNFLKKLEQSFMCVCCQELVYQPVTTECFHNVCK 
DCLQRS F KAQVFS C P ACRHDLGQNYIM I PNE I LQTLLDLFFPG Y 
SKGR 


6934 


3030 


2588 


DRDHSQCGGIRRVALARVSSVKLISKAKIRTVKMTFIIVIiAFIV 
CWTPFFFVQMWSVWDANAPKEASAFIIVMLLASLNSCCNPWIYM 
LFTGHLFHELVQRFLCCSASYLKGRRLGETSASKKSNSSSFVLS 
HRSSSQRSCSQPSTA 


6935 


886 


543 


NSALYVAGGNDGTSCLNSVERYSPKAGAWESVAPMNIRRSTHDL 
VAMDGWLYAVGGNDGSSSLNSIEKYNPRTNKWVAASCMFTRRSS 
VG VAVLELLNFP P PS S PTLS VS STSL 


6936 


1347 


567 


RSHRRQFLSRALLEFFGKSHPPPHRLFRKSLNVGLHYSHIPFLT 
TCLHFLRKRLQKGEVGLSVETSKPQVPVGGLSRKKVPQEPWATV 
MEKRLQEAQLYKEEGNQRYREGKYRDAVSRYHRALLQLRGLDPS 
LP S PL PNLG P QG PALT PEQEN I LHTTQT D C YNNLAACLLQM B P V 
NY E R VRE Y S Q KVLE RQ PDNAKAL YRAG VAF FH LQD YDQARH YLL 
AAVNRQPKDANVRRYLQLTQSELSSYHRKEKQLYLGMFG 


6937 


1 


727 


AVEFRCCPGRDPACFARGWRLDRVYGTCFCDQACRFTGDCCFDY 
DRACPARPCFVGEWSPWSGCADQCKPTTRVRRRSVQQEPQNGGA 
P C P PLEE RAGCL E YS T PQGQD CGHT YVP AF I TTS AFNKE RTRQA 
TSPHWSTHTEDAGYCMEFKTESLTPHCALENRPLTRWMQYLREG 
YTVCVDCQPPAMNSVSLRC3GDGLDSDGNQTLHWQAIGNPRCQG 
TWKKVRRVDQCSCPAVHSFIFI 


6938 


3 


719 


NS R KLELAERVDTDFMQLKKRRQS S EKENDSGTLDTVGAVWDH 
EGNVAAAVSSGGLALKHPGRVGQAALYGCGCWAENTGAHNPYST 
AVSTSGCGEHLVRTILARECSHALQAEDAHQALLETMQNKFISS 
P FLAS E DG VLGGVI VLRS CRCS AE PDSS QNKQTLLVE FLWSHTT 
ESMCVGYMSAQDGKAKTHISRLPPGAVAGQSVAIEGGVCRLGEP 
SELTLQAECEASQRHFRT 


6939 


3 


810 


KVTAPRRPQRYSSGHGSDNSSVLSGELPPAMGRTALFHHSGGSS 
GYESLRRDSEATGSASSAPDSMSESGAASPGARTRSLKSPKKRA 
TGLQRRRLIPAPLPDTTALGRKPSLPGQWVDLPPPLAGSLKEPF 
EIKVYEIDDVERLQRPRPTPREAPTQGLACVSTRLRLAERRQQR 
LREVQAKHKHLCEELAETQGRLMLEPGRWLEQFEVDPELEPESA 
E YLAALERATAALEQCVNLCKAHVMMVTCFDISVAASAAI PGPQ 
EVDV 


6940 


1188 


496 


GKMAAQPLRHRSRCATPPRGDFCGGTERAIDQASFTTSMEWDTQ 
WKGSSPLGPAGIiGAEEPAAGPQLPSWLQPERCAVFQCAQCHAV 
LADSVHLAWDLSRSLGAWFSRVTNNVVLEAPFLVGIEGSLKGS 
TYNLbFCGSCGIPVGFHLYSTHAALAALRGHFCLSSDKMVCYLL 
KTKAIVNASEMDIQNVPLSEKIAELKEKIVLTHNRLKSLMKILS 
EVTPDQSKPEN 


6941 


1 


713 


S LS RADS DPHGP HTCGHVLNV I IGSNVLALAEAQRQAEALGYQA 
WLSAAMQGDVKSMAQFYGLLAHVARTRLTPSMAGASVEEDAQL 
HE LAAELQ I PDLQLEE ALE TMAWG RG P VCLLAGGE PT VQLQGSG 
RGGRNQELALR VGAE LRRWP LG P I D VLFL SGGTDGQDG PTEAAG 
AWVTPEIASQAAAEGLDIATFLAHNDSHTFFCCLQGGAHLLHTG 
MTGTNVMDTHLLFLRPR 


6942 


1 


246 


GDYVERYDPKTDTWTMGAPLSMPTNAVGGCLLGDRLYADGGYDG 
QTYLNTMESYDPQTNEVfTQMASLNIGRAGACNAn^IKQP 


6943 


1 


739 


PMATGDGAKTLA I HVKALTADS I R I TWKATLPAS S FRLS WLRLG 
HS PAGGS ITETLVQGBKTEYLLTALEPKPTY 1 1 CMVTMETTNAY 
VADETPVCAKAETADSYGPTTTLNQEQNAGPMASLPLAGIIGGA 
VALVFLFLVLGAI CW YVHQAGELLTRERAYNRGS RFOCDD YMESG 
TKKDNS ILEIRGPGLQMLPINPYRAKEEYWHTI FPSNGSSLCK 
ATHTIGYGTTRGYRDGGI PDIDYSYT 


6944 


960 


156 


VAN I LLNGVKYES ELTGS S ERAEQPLS VGRLCS TI CNMP KALRT 
L C VNH FIjG WL S FEGMLLF YTDFMGE WFQGDP KA PHTS EA YQKY 
NSGVTMGCWGMCIYAFSAAFYSAILEKLEEFLSVRTLYFIAYLA 
FGLGTGLATLSRNLYWLSLCITYGILFSTLCTLPYSLLCDYYQ 
S KKFAG S S ADGTRRGMGVD I S LLS CQ Y FLAQ I L VS LVLG P LTS A 
VGSANGVMYFSSLVSFLGCLYSSLFVIYEIPPSDAADEEHRPLL 
LNV 


6945 


2067 


179 


EGEDRGLPRTMGAALGTGTRtAPWPGRACGALPRWTPTAPAQGC 
HSKPGP7VRPVPLKKRGYDVTRNPHLNKGMAFTLEERLQLGIHGL 
IPPCFLSQDVQLLRIMRYYERQQSDLDKYIILMTLQDRNEKLPY 
R VLT S D V E KFM P I VYT PT VG LiACOH YGLTP R TJPPGT.VTT runvr 

HLATMLNSWPEDNIKAVWTDGERILGLGDLGCYGMGIPVGKLA 
LYTACGGVNPQQCLPVLLDVGTNNEELLRDPLYIGLKHQRVHGK 
AYDDLLDEFMQAVTDKFGINCLIQFEDFANANAFRLLNKYRNKY 
CM FNDD I QGTAS VAVAG ILAALR I TKNKLSNHVFGFQGAGEAAM 
G \ IAHLLVMALE\KEGVPKA\EATRKI W\MVDF\KGLI VQGRDH 
LNHE KEM FAQD \ H PE VNS LE E WR L VKPTA I IG VAA I AE A \ FTE 
QILRDMASFHERP\lIFALSNPTSKAECTA\EKCYRVTEGPRGF 
FAS\GSPF* G VL I WEMG KTF I PGG RGNNA * R VPRG WQLG VHS PG 
GDPGHIP\DEIFLPDSRAKLPQEVSEQHLSQGRLYP\PLST\IR 
NVFLRIAIKVFD*GYKHNLV\SYYPEPKD\KEAFCKIPGSYTPD 
YDS FYT/VDS YI WAQGKAMKVQTV 


6946 


133 


2551 


SCEYSGITVAPGDPCPGVAHLLAPSMASDTPESLMALCTDFCLR 
NLDGTLGYLLDKETLRLHPDIFLPSEI\CDRLVNEYVELVNAAC 
NF\EPHE\SFFNPLFRDPRKQPASRRIHL\RED\LVQD\QD\LE 
AIRKQDL\VEL\YLTN\CEKLSAKSLQTLRSFSHTLGVP*AFFG 
C\TNILLLRKENPGGL/CEDEYLFNPTCQVLVKDFTFEGFSRLR 
F \LKLGRM I D W VPVES \ LLRPLNSLAALDLSG I QTS DAA\ FLTQ 
WKDSL\VSLVL\YNMDLSDDHIR\VIVQLHKLRHLDISRDRLSS 

YYKFTf T.TPPVT.QT.R^/PlWT/aMT.MCT.'nT ar*\ umtt PVTnim 

x uvr i\xjit\Ci\ usLtc vy]\iA3WJ^oJjUiy\j\lWIljENCSISKIGKR 
EAGQTSI\EPSK\SSIIPFRGFEGGPLQF\LGVF*GIFCGRLTH 
I PAY KVSGDKNEEQ VLNAI EAYTEHRPE I TSRAINLLFD IARIE 
RCNQLLRALKLVI TALKCHK YDRN I QVTGS AALF YLTNS E YRS E 
QS VKLRRQVI QWLNGMES YQE VTVQRNCCLTLCNFS I PEELEF 
QYRRVNELLLS ILNPTRQDES IQRIAVHLCNALVCQVDNDHKEA 
VGKMGFWTMLKLTQKKLLDKTCDQVMEFSW\SALWNITDETPD 
NCEMFLNFNGMKLFLDCLNEFPEKQELHRNMLGLLGNVAEVKEL 
RPQLMTSQFISVFSNLLESKADGIEVSYNACX3VLSHIMFDGPEA 
WGVCEPQREEVEERMWAAIQSWDINSRRN1NYRSFEPILRLLPQ 
GISPVSQHWATWALYNLVSVYPDKYCPLLIKEGGMPLLRDIIKM 
ATARQETKEMARKVIEHCSNFKEENMDTSR


6947 


2 


1682 


TSVSTIPRGLASARPQSRSWRCCPVWRRSPGRARGRGLKMLNVP 

P13ADDQnnDVRC mr^D C XT\TTiT vr\r* T3 O T until tot aivooimT nv* 

LKGRLIEVTEEELKKHNKKDDCWICIRGFVYNVSPYMEYHPGGE 
DELMRAAGSDGTELFDQVHRWVNYESMIjKECLVGRMAIKPAVLK 
D YRE E E KKVLNGML P KSQVTDTLAKEG PS Y PS YDW FQTDS L VT I 
/EHIY*TEGYQFRLNNS*SSE*FLYSRNNY*GLLISYTYW/R*A 
MRFRKIFLCGL/CESVGKIEIVLQKKENTSWDFLGHPLKNHNSL 
I PRKDTGL YYRKCQL I SKED VTHDTRLFCLMLPPSTHLQVP I GQ 
nv I JUrJj^x VJxfx 1 Jr VoobJULbbr JUSPvLPNNKYIYFLIK 
IYPTGLFTPELDRLQIGDFVSVSSPEGNFKISKFQELEDLFLLA 
AGTGFTPMVKILNYAIiTDI PSLRKVKLMFFNKTEDDI I WRSQLE 
KLAFKDKRLDVEFVLSAPISEWNGKQGHlSPALLSEFIiKRNLDK 
SKVLVCICGPVPFTEQGVRLLHDLNFSKNEIHSFTA 


6948 


104 


58 


PDGAHS FF PDE YFTCS SLCLS CG VGCKKS MNHGKEG VPHEAKSR 
CRYSHQYDNRVYTCKACYERGEEVSWPKTSASTDSPWMGLAKY 
AWSGYVI E CPNCG WYRSRQYWFGNQDPVDTWRTE I VHVWPGT 
DGFLKDNNNAAQRLLDGMNFMAQSVSELSLGPTKAVTSWLTDQI 
APAYWRPNSQILSCNKCATSFKDNDTKHHCRACX3EGFCDSCSSK 
TRPVPERGWGPAPVRVCDNCYEAR/TRPVSCYRGTSGR*RRRRT 
QETVE 


6949 


152 


4656 


GLRLCLSRPLTRPGDDSVGGSAMASGAGGVG6GGGGKIRTRRCH 
Q G P I KP YQ QGRQQHQG I LS R VTE S VKNI VPGW LQR YFNKNED VC 
SCSTDTSEVPRWPENKEDHLVYADEESSNITDGRITPEPAVSNT

EEPSTTSTAST\YPDVLTRVSLYRSHLNFSMLESPALHCQPSTS 

SAFPIGSSGFSLVKEIKDSTSQHDDDNISTTSGFSSRASDKDIT 

VSKNTSLPPLWSPEAERSHSLSQHTATSSKKPAFNLSAFGTLSP 

SLGNSSILKTSQLGDSPFYPGKTTYGGAAAAVRQSKLRNTPYQA 

P VRRQMKAKQLSAQS YG VTS S TARR I LQSLEKMS S PLADAKRI P 

S I VS S PLN S P LDRSG I D I TD FQAKRE KVDSQ Y P P VQRLMT P KPV 

S I ATNRS V Y FKPSLTPSGE FRKTNQR I DKKC S TG YE KNMT PGQN 

REQRESGFSYPNFSLPAANGLSSGVGGGGGKMRRERHAFVASKP 

LEEEEMEGPVLPKISLPITSSSLPTFNFSSPEITTSSPSPINSS 

QALTNKVQMTS PSSTGS PMFKFSS P I VKSTEANVLPPSS IGFTF 

SVPVAKTAELSGSSSTLEPIISSSAHHVTTVNSTNCKKTPPEDC 

EGPFRPAEILKEGSVLDILKSPGFASPKIDSVAAQPTATSPWY 

TRPAISSFSSSGIGFGESLKAGSSWQCDTCLLQNKVTDNKCIAC 

QAAKLSPRDTAKQTGIETPNKSGKTTLSASGTGFGDKFKPVIGT 

WDCDTCLVQNKPEAIKCVACETPKPGTCVKRALTLTvVsESAET 

MTASSSSCTVTTGTLGFGDKFKRPIGSWECSVCCVSNNAEDNKC 

VSCMSEKPGSSVPTSSSSTVPVSLPSGGSLGLEKFKKPEGIWDC 

ELCLVQNKADSTKCLACESAKPGTKSGFKGFDTSSSSSNSAASS 

SFKFGVSSSSSGPSQTLTSTGNFKFGDQGGFKIGVSSDSGYINP 

MSEGF*FSKHIVGFKFGVSSESKPEEVKKDSKNDNFKFGLSFGL 

SNPVFLTPFQFGVSNLGOEEKKEELLKSSrAOPRFOTrrwTMQTD 

VPANTIVTSENKSSFNLGTIETKSVSVAPLKCQTSEAKKEEMPA 
TKGGFSFGNVEPASLPSASVFVLGRTEEKQQEPVTSTSLVFGEG 
KLTMKEPKC\QPVFSFGEFQRQTKDENSSKSTFSFSMTKPSEKE 
S EQP AKATFAFG AQTNTTADQG AAKPD LS YLNNSS S S SS T PATS 
AGGG\lFGSSTSSSNPPVATFVFGQSSNPGSSS\AFGNTAESST 
SQSLLFSQDSKLATTSSTGTAVTPFVFGPGASSNNTTTSGFGFG 
ATTTS S SAGS S FVFGTGPS AP S AS P AFG AKQTPTFG Q S QG AS Q P 
NPPGFGSISSSTALFPTGSQPAPPTFGTVSSSSQPPVFGQQPSQ 
SAPGSGTTPNSSSAFQFGSSTTNFNFTNNSPSGVFTFGANSSTP 
AASAQPSGSGGFPFNQSPAAFTVGSNGKNVFSSSGTS FSGRKI K 
TAVRRRK 


6950 


2585 


411 


PRPGSRSGLCRRAGERGAVRAGGtSRRTRAE*IMDELHYQDTDS

DVPEQRDSKCKVKWTHEEDEQLRALVRQFGQQDWKFLASHFPNR 

TDQQCQYRWLRVLNPDLVKGPWTKEEDQKVIEIiVKKYGTKQWTL 

I AKHLKGRLGKQCRERWHNHLNPEVKKSCWTEEEDRI I CEAHKV 

LGNRWAE IAKMLPGRTDNAVKNHWNSTI KRKVDTGGFLSESKDC 

KP P VYL LLE LED KDGLQ S AO P TEGOGS LLTNW PS VP PT T K W pp m 

SEEELiAAATTSKBQEPIGTDLDAVRTPBPLEEFPKREDQEGSPP 

ETSLP YKWWEAANLL I PAVGS SLS EALDL I ES DPDAWCDLS KF 

DLPEEPSAEDSINNSLVQLQASHQQQVLPPRQPSA\LVPSVTEY 

RLDGHTISDLSRSSRGELIPISPSTEVGGSGIGTPPSVLKRQRK 

RRVALS PVTENSTSLSFLDSCWSLTPKSTPVKTLPFS PSQFLNF 

WNKQDTLELESPSLTSTPVCSQKVWTTPLHRDKTPLHQKHAAF 

VTPDQKYSMDNTPHTPTPFKNALEKYGPLKPLPQTPHLEEDLKE 

VLRSEAGIELIIEDDIRPEKQKRKPGLRRSPIKKVRKSLALDIV 

DEDMKLMMSTLPKSLSLPTTAPSNSSSLTLSGIKEDNSLLNQGF 

LQ AK P E KAAVAQ KP RSHFTTPAPMS S AWKT VACGGTRD QL FMQ E 

KARQLLGRLKPSHTSRTLILS 


6951 


1940 


239 


AGPDDTMKRSLQALYCQLLSFLLILALTEALAFAIQEPSPRESL
Q VL P S GT P PGTM VTAPH S STRHTS WMLT PNPDG P P S QAAAPMA 
TPTPRAEGHPPT\TPSPPSLRQ*PPPILKAP/SSTGPAPAAMAT 
T S S KPEGR PRGQ AAPT I LLT KP PG ATS RPTTAP P RTTTRR P PRP 
PGSSRKGAGNSSRPVPPAPGGHSRSKEGQRGRNPSSTPLGQKRP 
LGKIFQIYKGNFTGSVEPEPSTLTPRTPLWGYSSSPQPQTVAAT 
TVPSNTSWAPTTTSLGPAKDKPGLRRAAQGGGSTFTSQGGTPDA 

TAASGAPVSP/PSCPSAFSAPPPR*PTGWPQP**LLAYCYP\CT 

SRPLSTSSGVFTAATGPTPAAFDTSVSAPSQGIPQGASTTPQAP 

THPSRVSESTISGAKEETVA\PSP*PTGCPVLSPQWYPQPQAIS 

STAWSPPGPGSLGQQGTSPMWPRGTNRSTEPPSA*ARWISPG*S 

WPSACPSPP\LCPADGVLHEEEEEDRQPGEQPEAYGNNTHHPGT 

TPQQAC\RGAAPGEIPVPLKPLRTQLSBPRSPANGDYRDTGMVP 
C 


6952 


658 


304 


PESEGESGEMTDRYTIHSQLEHLQSKYIGT\ATPTPPSGSG\CE 
PTPRLVLLLHGPLRPSQLLRHCGE * EQSASPLQLDGKDASALWT 
ASRQARGELRLCLTTAVRGTSPSVSPVCQSS 


6953 


1512 


349 


NWGKTRALASGKHVPFGKQTNPNKS/VHCDS*G**RRETTQDES 
FSPHFRGKMGGW\KLEKELENTEQPVGGNEG*EHEVTGNLNSD 
PLLELCQCPLCQLDCGSREQLIAHVYQHTAAWSAKSYM\CPVC 
GRALSSPGSLGRHLLIHSEDQRSNCAVCGARFTSHATFNSEKLP 
E VLNM ES LP T VHNEG PSSAEGKDIAFSP P VY PAG I LL VCNNCAA 
YRKLLEAQTPSVRKWALRRQNEPLEVRLQRLERERTAKKSRRDN 
ETP E ERE VRRMRDRE AKRLQRMQE TDE QRARRLQRDRE AMRLKR 
Al ET PEKRQARLI REREAKRLKRRLEKMDMMLRAQFGQDPS AMA 
ALAAEMNFFQLPVSGVELDSQLLGKMAFEEQNSSSLH 


6954 


819 


1 


PPPPFIIPSHPREAGT*AG*KRSGDSECSPPVEQ*A*TRAAAQN 
* PQR * RWTEGNS PQAS AVATPGQGAS PAAPRCTP * PSRRHRRLP 
PGARPPAG*AAPAPTKPWLAGPASAPQPGAAPLSPPAPPLIRTR 
*CAGAAARGRPRRDRSPRPRTPGGCSWSEPRTPPAVSASAQTPS 
DAG*AGGR*GQRQRPSTGR+PPGVGGAGRSHRREGTIPGNPHPR 
AS*RAGWQR*PGP/REWGL*EPQGEEMSGPGGPGGAPPNOvnq<3 
VMQAMSTGI 


6955 


19^8 


782 


PPGRRQVRAQVAGAPVGHWGTRARQVKTGGRRRARRTMPFLGQD 

WRSPGWSWIKTEDGWKRCESCSQKLERENNHCNISHSIILWSED 

GEIFNNEEHEYASKKRKKDHFRNDTNTQSFYREKWIYVHKESTK 

ERHG YCTLGEAFNRLDFS SAIQD I RR FNYWKLLQLI AKS QLTS 

LSGVAQKNYFNILDKrVQKVLDDHHNPRLIKDLLQDLSSTLCrL 

/N*RSREVCISGKHQYLDLPIRNYSRLATTATGSSDD*ASE\NG 

LTLSDLPLHMLNNILYRFSDGWDIITLGQVTPTLYMLSEDRQLW 

KKLCQYHFAEKQFCRHLILSEKGHIEWKLMYFALQKHYPAKEQY 

GDTLHFCRHCSILFWKDSGHPCTAADPDSCFTPVSPQHFIDLFK 
F 


6956 


8605 


3839 


QTSTSIFASPTSPPVLGESVLQDNSFDLNNGSDAEQEEMETQSS 
DFPPSLTQPAPDQSSTIQLHPATSPAVSPTTSPAVSLWSPAAS 
PEISPEVCPAASTWSPAVFSWSPASSAVLPAVSLEVPLTASV 
TS P KAS P VTS PAAAFPTAS PAN KD VSS FLETTADVEE I TGEGLT 
ASGSGDVMRRRIATPEEVRLPLQHGWRREVRIKKGSHRWQGETW 
YYGPCGKRMKQFPEVIKYLSRNWHSVRREHFSFSPRMPVGDFF 
EERDTPEGLQWVQLSAEEIPSRIQAITGKRGRPRNTEKARTKEV 
PKVKRGRGR P PKVK I TELLN KTDNRPLKKLEAQETLNEEDKAKI 
AKSKKKMRQKVQRGECQTTIQGQARNKRKQETKSLKQKEAKKKS 
KAE KEKGKTKQEKLKEKVKREKKEKVKMKEKEE VTKAKPACKAJD 
KTLATQRRLE ERQ RQQM I LEE MKKPTEDMCL TDHQ P L P D FSR VP 
GLTLPSGAFSDCLTIVEFLHSFGKVLGFDPAKDVPSLGVLQEGL 
LCQGDSLGE VQDLLVRLLKAALHD PGFPS YCQSLK I LGEKVS E I 
PLTRDNV S E I LRCFLMAYG VE PAL CDRLRTQ P FQAQP PQQKAA V 
LAFL VHELNG S TL 1 1 NE I DKTLES MS S YRKN KW I VEGRLRRL KT 
VLAKR1X3RSEVEMEGPEECLGRRRSSRIMEVTSGMEEEEEEESI 
AAV PGRRGR RDGEVDATAS SIP E LERQ I E KLS KRQLF FRKKLLH 
SSQMLRAVSLGQDRYRRR YWVLP YLAGIFVEGTEGNLVPE EVI K 
KETDSLKVAAHASLNPALFSMKMELAGSNTTASSPARARGRPRK 
TKPGSMQPRHLKSPVRGQDSEQPQAQLQPEAQLHAPAQPQPQLQ

LQLQSHKGFLEQEGSPLSLGQSQHDLSQSAFLSWLSQTQSHSSL 

LSSSVLTPDSSPGKLDPAPSQPPEEPEPDEAESSPDPQALWFNI 

SAQMPCNAAPTPPPAVSEDQPTPSPQQIASSKPMNRPSJ^ANPCS 

PVQFSSTPLAGLAPKRRAGDPGEMPQSPTGLGQPKRRGRPPSKF 

FKQMEQRYLTQLTAQPVPPEMCSGWWWIRDPEMLDAMLKALHPR 

GI REKALHKHLNKHRDFLQEVCLRPSADP I FEPRQLPAFQEGI M 

SWSPKEKTYETDLAVLQWVEELEQRV1MSDLQIRGWTCPSPDST 

REDLAYCEHLSDSQEDITWRGRGREGLAPQRKTTNPLDLAVMRL 

AAL E QNVERR YLRE P LW PTHEWLE KALLS T PNGAP EGTTTE I S 

YEITPRIRVWRQTLERCRSAAQVCLCLGQLERSIAWEKSVNKVT 

CLVCRKGDNDEFLLLCDGCDRGCHIYCKRPKMEAVPEGDWFCTV 

CLAQQVEGEFTQKPGFPKRGQKRKSGYSLNFSEGDGRRRRVLLR 

GRE S PAAGPRYS EEGLS PSKRRRLS MRNHHSDLTFCE 1 1 LMEME 

SHDAAWPFLEPVNPRLVSGYRRIIKNPMDFSTMRERLXiRGGYTS 

SEEFAADALLVFDNCQTFNEDDSEVGKAGHIMRRKFE\SRWEEF 

YQG KQGQS VRQGRWG VTLWHL PPT FQT KTCH FH LLM L P W VQTQ V 

RYNPDF 


6957 


82 


3514 


HLIVAMPEPTKKEENBVPAPAPPPEEPSKEKEAGTTPAKDWTLV 
ET P PGEEQAKQMANS QLS I LF I EKPQGGTVKVGEDI TFI AKVKA 
EDLSEKPTINGSRKWMDLASKAGKHLQLKETFERHSRVYTFEMQ 
I I KAKDNFAGNYRCEVTYKDKFDSCSFDLEVHESTGTTPNIDIR 
SAFKRSGEGQEDAGELDFSGLLKRREVKQQEEEPQVDVWELLKN 
TKPSEYEKIAFQYESPTCSGMLKRLKRSIREEKKSAAFAKILDP 
VYQVD KGGR VR F WE LAD P KLE VKWN KNGQELR P S TKY I F E DTR 

CQSILNIDNCQMTDDSEYYVTAGDEKCSTELLVREPPIMVTKQL 
EDTTD YCGER VELECEVS EDDAQVKWFKNGEE 1 1 LVQTRYR I RV 
EGKKHILIIEGATKADAADYSVMTTGGQSSAKLSVDLKPLKILT 
PLTDQT VNLG KE I CLKCE ISENI PGKWTKNGLP VQESDRLKWH 
KGRIHKLVIDHALTEDEGDYVFAPDAYNVTLPAKVHVIDPPKI I 
LDGLDADNTVTVIAGNKLRLEIPISGEPPPKAMWSRGDKAIMEG 
SGRIRTESYPDSSTLVIDIAERDDSGVYHINLKNEAGEAHASIK 
VKWDFPDPPVAPTVTEVGDDWCIMNWEPPAYnnRQDTT pvt?tt? 

RKKKQSSRWMRLNFDLCKETTFEPKKMIEGVAYEVRIFAVNA\I 
GISKPSMPSRPFVPLAVTSPPTLLTVDSVTDTTVTMRWRPPDHI 
GAAGLDGYVLEYCFEGSTSAKQSDENGEAAYDLPAEDWIVANKD 
LIDKTKFTITGLPTDAKIFVRVKAVNAAGASEPKYYSQPILVKE 
IIEPPKIHSPKHLKQTYIRRVGDRVILVIPFQGKPRPELTWKKD 
GAEIDKNQINIRNSETDTIIFIRKAERSHSGKYDLQVKVDKFVE 
TASIDIRIIDRPGPPQIVKIEDVWGRNVALTWTPPKDDGNAAIT 
GYTIQKADKKSMEWLRVIEHIIEPVPHTELVIGNEYYFRVFSEN 
MCGLSEDATMTKESAVIARDGKIYKNPVYEDFDFSEAPMFTQPL 
VNRLCHSGYMATLNCSVRGNPKPKITWMKNKVAIVDDPRYRMFS 
NQGVCTLEIRKPSPYDGGTYCCKAVNDLGTVEIECICLPVTfVTAri 


6958 


274 


1663 


PRTSRVKTEGSQGSSAMDFSVKVDIEKEVTCPICLELLTEPLSL

DCGHSFCQACITAKI KESVI ISRGESSCPVCQTRFQPGNLRPNR 

HLAN1VERVKEVKMSPQEGQKRDVCEHHGKKLQIFCKEDGKVIC 

WVCELSQEHQGHQTFRINEWKECQEKLQVALQRLIKENQEAEK 

LEDD1RQERTAWKNYIQIERQKILKGFNEMRVILDNEEQRELQK 

LEEGE VNVLDNLAAATDQLVQQRQDAS TL I S DLQRRLRGSS VEM 

LQDVIDVMKRSESWTLKKPKSVSKKIiKSVFRVPDLSGMLQVLKE 

LTDVQYYWVDVMLNPGSATSNVAISVDQRQVKTVRTCTFKNSNP 

CDFSAFGVFGCQYFSSGKYYWEVDVSGKIAWILGVHSKISSLWK 

RKSSGFAFDPSVNYSKVYSRYRPQYGYWVIGLQNTCEYNAFEDS 

SSSDPKVLTLFMAV\LPWLGFS 


6959 


1 


1469 


S L VHWE FGRGI ED F P YLFFQL THCQQR I CS VTQAG VQ WCDHS S 
LQPQTPGLNQSSHLSLLSSRDYRMLSS1TOEWFWQDRFWLPPNVT 
WTELEDRDGRVYPHPQDLLAALPLALVLloAMRLAFERFIGLPLS 
R WLG VRDQTRRQ VKPNATLE KH FLTEGHR P KE PQLS LLAAQCGL 
TLQQTQRWFRRRRNQDRPQLTKKFCEASWRFLFYLSSFVGGLSV 
LYHESWLWAPVMCWDRYPNQLTLSCPAADSEA\SLYWWYLLELG 
FYLSLLIRLPFDVKRKGGGPSSIKPRPHYDPPSTA\DFKEQVIH 
HFVAVILMTFSYSANLLRIGSLVLLLHDSSDYLLEACKMVNYMQ 
YQQVCDALFLI FSFVFFYTRLVLFPTQI LYTTYYES ISNRGPFF 
GYYFFNGLLMLLQLLHVFWSCLILRMLYSFMKKGQMEKDIRSDV 
EESDSSEEAAAAQEPLQLKNGTAGGPRPAPTDGPRSRVAGRLTN 
RHTTAT 


6960 


387 


2068 


AKWAREKEMQEF\TRSFF\RGRPDLSTLTHSIVRRRYLAHSGRS 

PCSDPERKRFR FNS E S ESGS E AS S P DY FG P P AKNGVAS RS HTH P 

KEENPRRA\SKAVEESSDEERQRDLPAQRGEESSEEEEKGYKGK 

TRKKPWKKQAPGKASVSRKQAREESEBSEAEPVQRTAKKVEGN. 

KGTKSLKESEQESEEEILAQKKEQREEEVEEEEKEEDEEKGDWK 

PRTRSNGRRKSAREERSCKQKSQAKRLLGDSDSEEEQKEAASSG 

DDSGRDREPPVQRKSEDRTQLKGGKRLSGSSEDEEDSGKGEPTA 

KGSRKMARLGSTSGEESDLEREVSDSEAGGGPQGERKNRSSKKS 

SRKGRTRSSSSSSDGSPEAKGGKAGSGRRGEDHPAVMRLKRYIR 

ACGAHRNYKKLLGSCCSHKERLSILRAELEALGMKGTPSLGKCR 

LYRRTLDSDEERPRPAPPDWSHMRGIISSDGESN 


6961 


340 


1646 


RPWS S PTMKPNFS LRLR I FNLNCWG I P YLS KHRADRMRRLGDFL 
NQESFDLALLEEVWSEQDFQYLRQKLSPTYPAAHHFRSG1IGSG 
LCVFSKHPIQELTQHIYTLNGYPYMIHHGDWFSGKAVGLLVLHL 
SGMVLNAYVTHLHAEYNRQKDIYLAHRVAQAWELAQFIHHTSKK 
ADWLLCGDLNMHPEDLGCCLLKEWTGLHDAYLETRDFKGSEEG 
NTM V P KNC YVS QQELK P F P FG VR I D YVL Y KAVS G FT I S CKS FET 
TTGFDPHRGTPLSDHEALMATLFVRHSPPQQNPSSTHGP\AERS 
PL/MCVCLKEALDGSLGLGMA\QARWWA\TFA\SWIGLGL\LL 
LALLC VLAAGGGAnFA A T T .T >WTD Q V(2T.\n,W Afi E*VT xnnTr\T?\rKir> 

LYRAQAELQHVLGRAREAQDLGPEPQLYALL\LGQQEGDRTKEQ 


6962 


340 


1646 


RPWSSPTMKPNFSLRLRIPNLNCWGIPYLSKHRADRMRRLGDFL 
NQES FDLALLEE VWSEQDFQ YhRQKhS PTYPAAHHFRSG IIGSG 
LCVFSKHPIQELTQHIYTLNGYPYMIHHGDWFSGKAVGLLVLHL 
SGMVLNAYVTHLHAEYKRQKDIYLAHRVAQAWELAQFIHHTSKK 
ADWLLCGDLNMHPEDLGCCLLKEWTGLHDAYLETRDFKGSEEG 
NTMVPKNCYVSQQELKPFPFGVRIDYVLYKAVSGFYISCKSFET 
TTGFDPHRGTPLSDHEALMATLFVRHSPPQQNPSSTHGP\AERS 
PL / MC VCLKE ALDG S IX3LGMA\ Q ARWWA\T FA\ S YVIGLG L \ LL 
LALLCVLAAGGGAGEAAILLWTPSVGLVLWAGAFYLFHVQEVNG 
L YRAQ AE LQH VLGRARE AQDLG P E P QL YALL \LGQQEGDRTKEQ 


6963 


374 


2618 


RVT PL I LKLLKKPKTAENQKAS EENE I TQPGGS SAKPGL P CLNF 
EAVLSPDPALIHSTHSLTNSHAHTGSSDCDISCKGMTERIHSIN 
LHNFSNSVLETLNEQRNRGHFCDVTVRIHGSMLRAQRCVLAAGS 
PFFQDKLLLGYSDIEIPSWSVQSVQKLIDFMYSGVLRVSQSEA 
LQ I LTAAS I LQ I KTV I DE CTR I VS QNVGD VF PG I QDSGQDTPRG 
TPESGTSGQSSDTESGYLQSHPQHSVDRIYSALYACSMQNGSGE 
RSFYS GAWSHHETALGLPRDHHMEDPS WITR 1HERSQQMER YL 
STTPETTHCRKQPRPVRIQTLVGNIHIKQEMEDDYDYYGQQRVQ 
ILERNESEECTEDTDQAEGTESEPKGESFDSGVSSSIGTEPDSV 
EQQFGPGAARDSQAEPTQPEQAAEAPAEGGPQTNQLETGASSPE 
RSNEVEMDSTVITVSNSSDKSVLQQPSVNTSIGQPLPSTQLYLR 
QTETLTSNLRMPLTLTSNTQVIGTAGNTYLPALFTTQPAGSGPK 
PFLFSIjPOPIiAGOOTnPVTVSOPGL^TFTAOIiPAPnPT.A^^lvrJH 
STASGQGEKKPYECTLCNKTFTAKQNYVKHMFVHTGEKPHQCSI 
CWRS FS LKD YLI K\ HMVTHTG VRA YQCS I CNKR FTQKS S LNVHM 
RLHRGEKSYECYICKKKFSHKTLLERHVALHSASNGTPPAGTPP 
GARAG P PG WACTEGTT YVCS VCPAKFDQ I EQFNDHMRMHVS DG 


6964 


1 


178


SGRPFFFFFSNTDVYFIKKVTNRWTAGSSYKMTRMKSIGKILLL 
QIFIG\NCSMFVLVI 


6965 


757 


208 


NVFIEPRIQGFMKTSAHPGQKHPDFSMGLLFPLLAALEVCSCGS 

PQEKVEVSQLQKA\QAMSFLYDVLQQVFNFSHKALL\CCMEHDL 
PG P T P H FTS S AAG T PGD LLG AGDG RRRS WGQWV I EGSTLALRR Y 
FQESISTLE 






i act 
loo f 


II lALOVRGMPGCPCFGCGMAGPRLLFLTALALELLGRAGGSQP 
ALRSRGTATACRLDNKESESWGALLSGERLDTWICSLLGSLMVG 
LSG VF P LLV I P IjEMGTMLRS EAGAWRLKQLLS falggllgnvfl 
HLLPEAWAYTCSASPGGEGQSLQQQQQLGLWVIAGILTFLALEK 
/HVPGQQGGGDQPGPQQRPHCCCRRAQWRPLSGPAGCRARPRCR 

GPNDIKVSGYLNLLANTIDNFTHGIAVAASFLVSKKIGLLTTMA
ILLHEIPHEVGDFAILLRAGFDRWSAAKLQLSTALGGLLGAGFA
ICTQSPKGVEETAAWVLPFTSGGFLYIALVNVLPDLLEEEDPW


6967 


162 


633 


GFLPFKYWILDLSASSRMETDCNPMELSSMSGFEEGSELNGFEG 
TDMKDMRLEAEAVVNDVLFAVl^FVSKSLRaVDDVAYINVETK 
ERNRYCLELTEAGLKWGYAFDQVDDHLQTPYHETVYSLLDTL\ 
S PAYREAFGKR \ LLQRLEALKRDGQS 


6968 


1 


2265 


RGGGGGRGGPGARERERPGEPERTMEAAAGGRGCFQPHPGLQKT 
LEQFHLSSMSSLGGPAAFSARWAQEAYKKESAKEAGAAAVPAPV 
PAATEPPPVLHLPAIQPPPPVLPGPFFMPSDRSTERCETVLEGE 
TI SCFWGGEKRLCLPQI LNSVLRDFSLQQ INAVCDELH I YCSR 
CTADQ LE I LKVMG I LP F S AP S CGL I T KTDAERL CNALL YGGA YP 
PPCKKELAASLALGLELSERSVRVYHE\CFGKCKGL\LVPELYS 
S>i*bAAUiyCJjU \ CKIjMYPPHK.fc V VHSHKALENRTCHWGF \DSA\ 
NWRA Y I LLS QD YTG KEEQ ARLGR \ C LDD VKE KFD YGNKY KRR VP 
RVSSEPPASIRPKTDDTSSQSPAPSEKDKPSSWLRTLAGSSNKS 
LGCVHPRQRLSAFRPWSPAVSASEKELSPHLPALIRDSFYSYKS 
FETAVAPNVALAPPAQQKWSSPPCAAAVSRAPEPLATCTQPRK 
RKLTVDTPGAPETLAPVAAPEEDKDSEAEVEVESREEFTSSLSS 
LSSPSFTSSSSAKDLGSPGARALPSAVPDAAAPADAPSGLEAEL 
EHLRQALEGGLDTKEAKEKFLHEWKMRVKQEEKLSAALQAKRS 
LHQELEFLRVAKKEKLREATEAKRNLRKEIERJjRAENEKKMKEA 
NESRLRLKRELEQARQARVCDKGCEAGRLRAKYSAQIEDLQVKL 
QHAEADREQLRADLLREREAREHLEK\WK\ELQEQLWPRARPE 
AAGSEG\AAELEP 


6969 


1855 


118 


AGTMHGRLKVKTSEEQAEAKRLEREQKLKLYQSATQAVFQKRQA 
GELDES VLELTSQ I LGANPDFATLWNCRREVLQQLETQKS PEEL 
AAljVJUihJjCjr LLbCLiRVNPKo iCsl WHHRCWIjIjORLPEPNWTREL 
EL CARFLEVDERNFHCWDYRRFVATQAAVPPAEELAFTDSL I TR 
NFSNYSSWHYRSCLLPQLHPQPDSGPQGRLPEDVLLKELELVQN 
AFFTDPNDQSAWFYHRWLLGRADPQDALRCLHVSRDEACLTVSF 
SRPLLVGSRMEILLLMVDDSPLIVEWRTPDGRNRPSHVWLCDLP 
AASLNDQLPQHTFRVIWTAGDVQKECVLLKGRQEGWCRDSTTDE 
QLFRCELSVEKSTVLQSELESCKELQELEPENKWCL\LTIILLM 
RALDPLLYEKETLQYFQTLK\AWDPKRATY\LDDLRSKFLLENS 
VL KM E YAEVR V LH LAH KD LT VLCH L EQLLLVTHL DLSHNRLRTL 
PPALAALRCLEDPPPRT\VLQASDNAIESLDGVTNLPRLQELLL 
ClNfNRLQQPAVLQPLASCPRLVLLNLQGNPLCQAVGILEQLAELL 
PSVSSVLT 


6970 


3 


1528 


SFPPLLSSPSAVGEGKVAVAAPCPGRSECARAKMAYIQLKPLNE 
G FLS RI SGLLliCRWTCRHCCQKC YESSCCQS SEDEVE I LG PFP A 
QTPPWLMASRSSDKDGDSVHTASEVPLTPRTNSPDGRRSSSDTS 
KSTYSLTRRISSLESRRPSSPLIDIKPIBFGVLSAKKEPIQPSV 
LRRTYNPDDYFRKFEPHLYSLDSNSDDVDSLTDEEILSKYQLGM 
LH FS TQ YDL LHNHLTVR V I EARDL P P P I SHDG S RQDMAHS NP Y V 
KICLLPDQKNSKQTGVKRKTQKPVFEERYTFEIPFLEAQRRTLL 
LTVYDFDKFSRHCVIGKVSVPLCEVDLVKGGHWWKALIPSSQNE 
VELGELLLSLNYLPSAGRLNVDVIRAKQLLQTDVSQGSDPFVKI 
QLVHGLKLVKTKKTSFLRGTIDPFYNESFSFKVPQEELENASLV 
FTVFGHNMKSSNDFIGRIVIG\QYSSGP\SEPNHWRRMLNTHRT 
AVEQWHSLRSRAECDRVSPASLEVT 


6971 


37 


3702 


ACFYVPGSRSFKLIPRHGLVNMGRSGKLPSGVSAKLKRWKKGHS 
SDSNPAICRHRQAARSRFFSRPSGRSDLTVDAVKLHNELQSGSL 
RLGKSEAPETPMEEEAELVLTEKSSGTFLSGLSDCTNVTFSKVQ 
RFWESNS AAHKE I CAVLAAVTEVI RSQGGKETETE YFAAL I RKA 
AQHGVCSVLKGSEFMFEKAPAHHPAAISTAKFCIQEIEKSGGSK 
EATTTLHMLTLLKDLLPCFPEGLVKSCSETLLRVMTLSHVLVTA 
CAMQAFHS L FHAR PGL S TL S AELNAQ 1 1 TAL YD YVPSENDLQPL 
LAWLKVME KAH I NLVRLQ WDLGLGHL PRFFGTAVTCLLS PHS QV 
LTAATQSLKEIIiKECVAPHMADIGSVTSSASGPAQSVAKMFRAV 
EEGLTYKFHAAWSSVLQLLCVFFEACGRQAHPVMRKCLQSLCDL 
RLSPHFPHTAALDQAVGAAVTSMGPEWLQAVPLEIDGSEETLD 
FPRSWLLPVIRDHVQETRLGFFTTYFLPLANTLKSKAMDLAQAG 
STVESKIYDTLQWQMWTLLPGFCTRPTDVAISFKGLARTLGMAI 
SERPDLRVT VCQALRTL ITKGCQAEADRAEVSRFAKNFLP I LFN 
LYGQ P VAAGD TP APRRAVLE TI RTYLTI TDTQL VNS LLE KAS EK 
VLDPASSDFTRLSA/LDLVVAIiAPCADEAAISKLYSTIRPYLESK 

TSSPAKRPRLKCLLHIVRKLSAEHKEFITALIPEVILCTKEVSV 
GAR KNAFAL LVE MGHAFLR FG SNQE E ALQCY L VL I Y PGL VGAVT 
MVS C S I LALTH L L FE F KGLMGTS TVE QLLENVCLLLAS RTRD W 
KS ALG F I KVAVT VMDVAHLAKHVQLVMEAIG KLSDDMRRHFRMK 
LRNLFT\ KFI PK\ FGILTWGKKAVGP KE YHRVLVN I RKAEARAK 
RHRALSQAAVEEEEEEEEEEEPAQGKGDSIEEILADSEDEEDNE 
EEERSRGKEQRKLARQRSRAWLKEGGGDEPLNFLDPKVAQRVLA 
TQPG PGRGRKKDHS FKVSADGRLI IREEADGNKMEEEEGAKGED 
EEMADPMEDVIIRNKKHQKLKHQKEAEEEELEIPPQYQAGGSGI 
HRP VAKKAM P GAE Y KAKKAKGD VKKKGR P DP YA Y I P LNRS KLNR 
RKKMKLQGQFKGLVKAAQRGSQVGHKNRRKDRRP 


6972 


2179 


973 


PGGAILLPLWRRTRPREATVPRGAAQRGRARSAEGRIPSSQSPS 
PAEAGGATRSPPPRPPRPARPPGPSAPPLLRSDAGPGATVSAAA 
AAATERARRGATMGAQLSTLGHMVLFPVWFLYSLLMKLFQRSTP 
A I TLE S PD I K YPLRL I DRE 1 1 S HDTRR FR FAL PS PQHI LGL P VG 
QH I YLS AR I DGNL WRP YTP I S S DDD KG FVDL V I KVYF KDTH P K 
FPAGGKMSQYLESMQIGDTIEFRGPSGLLVYQGKGKFAIRPDKK 
SNPIIRTVKSVGMIAGGTGITPMLQVIRAIMKDPDDHTVCHLLF 
ANQTEKDILLRPELEELRNKHSARFKLWYTLDRAPEAWDYGQG\ 
FVNEEMIRDHLPPPE\EEPLVLMCGPPPMIQYACLPNL\DIIVGH 
PTERCFVF 


6973 


1 


1964 


LQPRCAHRGLRAQKCGRPAPGVDAMVLCPVIGKLLHKRWLASA 
S PRRQE I LSNAGLRFE WPS KFKEKLDKAS FATP YG YAMETAKQ 
KALEVANRLYQKDLRAPDWIGADTIVTVGGLILEKPVDKQDAY 
RMLSRFE/ S GREHS VFTGVAI VHCSS KDHQLDTRVS E F YEETKV 
KFSELSEELLWEYVHSGEPMDKAGGYGIQALGGMLVESVHGDFL 
NWGFPLNH FCKQLVKLYYP PR P EDLRRS VKHDS I PAADTFEDL 
S D VEGGG S E PTQRDAGS RD E KAEAGEAG Q ATAEAB CHRTR E T L P 
PFPTRLLELIEGFMIjSKGHjTACKLKVFDLIjKDEAPQKAADIAS 
KVDASACGMERLLDICAAMGLLEKTEQGYSNTETANVYLASDGE 
YSLHGFI MHNNDLTWNLFTYLE FAIREGTNQHHRALGKKAEDLF 
QDAYYQSPETRLRFMRAMHGMTKLTACQVATAFNLSRFSSACDV 
GGCTGALARELAREYPRMQVTVFDLPDIIEloAAHFQPPGPQAVQ 
IHFAAGDFFRDPLPSAELYVLCRILHDWPDDKVHKLLSRVAESC 
KPGAGLLLVETLLDEEKRVAQRALMQSLNMLVQTEGKERSLGEY 
QCLLELHGFHQVQWHLGGVLDAIL\PPKWPPEAQAACSL 


6974 


3082 


2172 


RSCAAFASFASRPPLELFAPPGSHRSPPGRGVATSAQCALSVRK 
LLAARPGLGTKYQATMVYKTLFALCILTAGWRVQSLPTSAPLSV 
SLPTNIVPPTTIWTSSPQNTDADTASPSNGTHNNSVLPVTASAP 
TSLLPKNISIESREEEITSPGSNWEGTNTDPSPSGFSSTSGGVH 
LTTTLEEHSLGTPEAGVAATLSQSAAEPPTLISPQAPASSPSSL 
STSPPEVFSASVTTNHSSTVTSTQPTGAPTAPESPTEESSSDHT 
PTSHATAEPVPQEKTPPTTVSGKVMCELIDMET\PPPFPG 


6975 


2 


500 


RPRPTVHCCKWALKLETAMETLINVFHAHSGKEGDKYKLSKKEL 
KELLQTELSGFLDVKELML*ATEAL»KTFEEA+ KSPIIQCSSSRS 
SLPPAPQPPPYL+LSAVPFPIHLPLPLLPPQAQKDVDAVDKVMK 
ELDENGDGEVDFQEYWLVAALTVACNNFFWENS 


6976 


1216 


970 


GCQL* VAYGTTENS PVTFAHFPEDTVEOKAES VGR TMPHTPaR T 
MNMEAGTLAKLNTPGELCIRGYCVMU3YWGEPQKTEEAVDQDKW 
YWTGDVATMNEQGFCKI VGRS KDM 1 1 RGGEN I YPAELEDFFHTH 
P KVQE VQ WGVKDDRMGE EI CAC I RLKDGEETTVEE I KAFCKGK 
ISHFKIPKYIVFVTNYPLTISGKIQKFKLREQMERHLNL* IKQQ 
ACPGRLA 


6977 


129B 


588 


SLFINTNLLSNQIRKTSFGMCSEPISDNTEDQKGKIjKTPDFA*R 
7WKKSKHHVNGNRTVEPFPEGTQMAVFGMGCFWGAERKFWVLKG 
VYS TQ VG FAGG Y TSNP T YKEVC S E KTGHAE WR WYQP EHMS FE 
ELLKVFWENHDPTQGMRQGNDHGTQYRSAIYPTSAKQMEAALSS 
KENYQKVLSEHGFGPITTDIREGQTFYYAEDYHQQYLSKNPNGY 
CGLGGTGVS CP VGIKK 


6978


3 


242 


SFPFRDSRRCGCCKGSSLRHTAVAMVKLSKEAKQRLQQLFKGSQ 
FAIRWGFIPLVIYLGFKRGADPGMPEPTVLSLLWG 


6979 


3917 


1146 


DEAR VRGEAVAAAILS R CRHWSGP P P FPPS PPDR KGLRG TE P WE 
AGPGSGATPGARAMDVRRLKVNELREELQRRGLDTRGLKTELAE 
RLQAALEAEEPDDERELDADDEPGRPGHINEEVETEGGSELEGT 
AQPPPPGLQPHAEPGGYSGPDGHYAMDNITRQNQFYDTQVIKQE 
NESGYERRPLEMEQQQAYRPEMKTEMKQGAPTSFLPPEASQLKP 
DRQQFQSRKRPYEENRGRGYFEHREDRRGRSPQPPAEEDEDDFD 
DTLVAIDTYNCDLHFKVARDRSSGYPLTIEGFAYLWSGARASYG 
VRRGRVCFEMKINEEISVKHLPSTEPDPHWRIGWSLDSCSTQL 
GEE P F S YG YGGTGKKS TNS R FENYGD KFAEND V I G CFAD FECGN 
DVELSFTKNGKWMRTAFft Tnpn7BT.n<7nAr.VDWVT i;Trxxr , 2nn? etj u» 

is v ijjjq l a ivj.vvjrvr»«'iLT.i.r-i.f t\. i,\£RjSinLAs\3^i\Li x trCi V ur &S*sJJ\v a C vie i 

GQRAE P YCS VL PG FTF I QHLPLS ER I RGTVGP KS KAECE I LMMV 
GLPAAGKTTWAI KHAASNPS KKYNILGTNAIMDKMRVMGLRRQR 
NYAGRWDVLIQQATQCLNRLIQIAARKKRNYILDQTNVYGSAQR 
RKMRPFEGFQRKAIVICPTDEDLKDRTIKRTDEEGKDVPDHAVL 
EMKANFTLPDVGDFLDEVLFIELQREEADKLVRQYNEEGRKAGP 
PPEKRFDNRGGGGFRGRGGGGGFQRYENRGPPGGNRGGFQNRGG 
GSGGGGNYRGGFNRSGGGGYSQNRWGNNNRDNNNSNNRGSYNRA 
PQQQPPPQQPPPPQPPPQQPPPPPSYSPARNPPGASTYNKNSNI 
PGSSANTSTPTVSSYSPPQSFGFFPSTFQPSYSQPPYNQGGYSQ 
GYTAPPPPPPPPPAYNYGSYGGYNPAPYTPPPPPTAQTYPQPSY 
NQ YQQ YAQQWNQ Y YQNQGQ W P P Y YGN YD YG S YS GNTQGGT S TQ 


6980 


1 


420 


GTRGRKTGRVAAPSTRRRTGNMQKLQTRSPAMSLSDPGLGYHPT 
CWTLRWPPLCSLHALHVFHCLFSSRLGTPVSPRLAMDPNCSCEA

GGSCACAGSCKCKKCKCTSCKKSCCSCCPLGCAKCAQGCICKGA 
SEKCSCCA 


6981 


10 


1054 


PGRGFRRASLRPAFAARGVFQGGLGQAKQARTRACAALPTPkPS 
APRLLEPQGVFSLFPPPPGPWPNMILTKAQYDEIAQCLVSVPPT 
RQSLRKLKQRFPSQSQATLLSIFSQEYQKHIKRTHAKHHTSBAI 
ES YYQR YLNG WKNGAAPVLLDLANEVDYAPS LMARL1 LER FLQ 
EHEETPPSKSI INSMLRDPSQI PDGVLANQVYQCIVNDCCYGPL 
VDCIKHAIGHEHEVLLRDLLLEKNLSFLDEDQLRAKGYDKTPDF 
I LQ V P VAVEGHI I H W IESKASFGDECS HHA YLHDQ FWS Y WNRFG 
PGLVIYWYGFIQELDCNRERGILLKACFPTNIVTLCHSIA 


6982 


153 


1285 


FPQQDCSAPAAPGLAGSEPRRLRAYRRRRQRARGLKRVAWLAPP
PSLLQGLQGWAQAPVDGTLGPEDSRASSPMIQNSRPSLLQPQDV 
GDTVETLMLHPVIKAFLCGSISGTCSTLLFQPLDLLKTRLQTLQ 
P S DHG S RRVG MLAVLI . KWRT P <3 T ,T J~n , w YCZ M Q D c ttto r*\ r n n\ in t 

YFGTLYSLKQYFLRGHPPTALESVMLGVGSRSVAGVCMSPITVI 
KTRYESGKYGYESIYAALRSIYHSEGHRGLFSGLTATLLRDAPF 
SGIYLMFYNQTKNIVPHDQVDATLIPITNFSCGIFAGILASLVT
QPADVIKTHMQLYPLKFQWIGQAVTLIFKDYGLRGFFQGGIPRA 
LRRTLMAAMAWTVYEEMMAKMGLKS 


6983 


82 


773 


EMSFLQDPSFFTMGMWSIGAGALGAAALALLLANTDVFLSKPQK 
AALEYLEDIDLKTLEKEPRTFKAKEIiWEKNGAVIMAVRRPGCFL 
CREEAADLSSLKSMLDQLGVPLYAWKEHIRTEVKDFQPYFKGE
IFLDEKKKFYGPQRRKMMFMGFIRLGVWYNFFRAWNGGFSGNLE 
GEGFI LGGVPVVGSGKQG I LLEHREKEFGDKVNLLS VT.P A A KM T 
KPQTIiASEKK 


6984 


1845 


1282 


GGRSAYSLPAGSLPRVPATAAAKMASGVQVADEVCRIFYDMKVR 
KCSTPEEIKKRKKAVIFCLSADKKCIIVEEGKEILVGDVGVTIT 
DPFKHFVGMLPEKDCRYALYDASFETKESRKEELMFFLWAPELA
PLKSKMIYASSKDAIKKKFQGIKHECQANGPEDLNRACIAEKLG
GSLIVAFEGCPV 


6985 


1887 


1324 


RRTAGIYPCFPKPGRTRHALCSWLLLLTGQLAFDDFQESCAMM 
WQKYAGSRRSMPLGARILFHGVFYAGGFAIVYYLIQKFHSRALY
YKLAVE QLQS 1 1 P E AQEALG P P LNI H YL KL IDREN FVD I VD AKL K 
IPVSGSKSEGLLYVHSSRGGPFQRWHLDEVFLELKDGQQIPVFK 
LSGENGDEVKKE 


6986 


642 


1350 


YHLYFKMGDPNSRKKQALNKLRAQLRKKKEStiADQFDFKMYIAF 
VFKEKKKKSALFEVSEVIPVMTNNYEENILKGVRDSSYSLESSL 
ELLQKDWQLHAPRYQSMRRDVIGCTQEMDFILWPRNDIEKIVC 
LLFSRWKESDEPFRPVQAKFEFHHGDYEKQFLHVLSRKDKTGIV 
VNNPNQSVFLFIDROHLOTPKNKATTPVT.r , <?Tr , T.VT DnirnT tuu 
AVGT I EDHLRP YMPE 


6987 


1623 


341 


LEAAEKASRAFKESQRQTDSKNYETENWSPQKSQRRYDMYNTAC 
FLGEIEVGLYTIQILQLTPFFHKENELSKKHMVQFLSGKWTIPP
DPRNECYLALSKFTSHLKNLQSDLKRCFDFFIDYMVLLKMRYTQ 
KEIAEIMLSKKVSRCFRKYTELFCHLDPCLLQSKESQLLQEENC 
RKKLEALRADRFAGLLEYLNPNYKDATTMESIVNEYAFLLQQNS 
KKPMTNEKQNSILANIILSCLKPNSKLIQPLTTLKKQLREVLQF
VGLSHQYPGPYFLACLLFWPENQELDQDSKLIEKYVSSLNRSFR 
GQYKRMCRSKQASTLFYLGKRKGLNSIVHKAKIEQYFDKAQNTN 
SLWHSGDVWKKNEVKDLLRRLTGQAEGKLISVEYGTEEKIKIPV 
ISVYSGPLRSGRNIERVSFYLGFSIEGPPGL 


6988 


3 


689 


TQLLRRPAVFVGSAASGIRSGLWSASSGHWCAPAAGRAHAPVPR 
LVRGLGAASTAAPQDAQTGPQPMPRADCIMRHLPYFCRGQWRG 
FGRGSKQLGIPTANFPEQWDNLPADISTGIYYGWASVGSGDVH
KMWSIGWNPYYKNTKKSMETHIMHTFKEDFYGEILNVAIVGYL 
RPEKNFDSLESLISAIQGDIEEAKKRLELPEHLKIKEDNFFQVS
KSKIMNGH 


6989 


2 


1118 


LMPSDRPLSPSTHASAGSHCHAPPTTARRAFPIPFGSKSNMATL
KDQLIYNLLKEEQTPQNKITWGVGAVGMACAISILMKDLADEL
ALVDVIEDKLKGEMMDLQHGSLFLRTPKIVSGKDYNVTANSKLV
IITAGARQQEGESRLNLVQRNVNIFKFIIPNWKYSPNCKLLIV
SNPVDILTYVAWKISGFPKNRVIGSGCNLDSARFRYLMGERLGV
HPLSCHGWVLGEHGDSSVPVWSGMNVAGVSLKTLHPDLGTDKDK
EQWKEVHKQWESAYEVIKLKGYTSWAIGLSVADIAESIMKNLR
RVHPVSTMIKGLYGIKDDVFLSVPCILGQNGISDLVKVTLTSEE
EARLKKSADTLWGIQKELQF 


6990 


719 


258 


THASGMASWLALRTRTAVTSLLSPTPATALAVRYASKKSGGSS 
KNLGGKSSGRRQGIKKMEGHYVHAGNIIATQRHFRWHPGAHVGV
GKNKCLYALEEGIVRYTKEVYVPHPRNTEAVDLITRLPKGAVLY 
KTFVHWPAKPEGTFKLVAML 


6991 


169 


451 


RRSSDFHNPGFLSRPVSLRENIHHQVICSTKNKRRNPKKIAYLL 
SSLLMTNLNPNESTENQPVDAYWAFTLDQEFLTYACVEGTGCLF 
CGRHVH 


6992 


944 


510 


RQAPGCSSLAliRQVRQVYCGLVRAPQVQTRPLSSRFVERRGALY 
RSPMNQENPPPYPGPGPTAPYPPYPPQPMGPGPMGGPYPPPQGY 
PYQGYPQYGWQGGPQEPPKTTVYWEDQRRDELGPSTCLTACWT 
ALCCCCLWDMLT 


6993 


1 


374 


QWCVTCPQHNARQGPAVPPGIQAYGAAPFEDLQVDFTEMSKCRG 
DRVWIKNWNVASLCPLWKGPQTVVLSPPTAVKVEGIPAWIHHSH 
VKPAARETWEARPSPDNPFRVTLKKTTSPAPVTPGS


6994 


346 


1100 


QW PEKDP VMAASS I S S P WGKHVFKA I laMVLVAL I LLHSALAQS R 
RD FAP PGQQKREAP VDVLTQIGRS VRGTLDAWI G PETMHLVSES 
S S Q VLWA I S S A I S VAF FALSG I AAQLLNALGLAG DYLAQG LKL S 
PGQ VQTFLL WGAGAL W Y W £JjS LLLG L VLALLG R I LWG L KLVI F 
LAG F VALMRS V P DP S TRALLLLALL I LYALL S RLTGS RAS GAQL 
E AKVRGLERQ VE ELR WRQRRAAKGAR S VE E E 


6995 


144 


1346 


GS VAVGLS G I MAAQKD LWDAI V I GAG I QG C FTAYHLAKHR KR I L 
LLEQFFLPHSRGSSHGQSRIIRKAYLEDFYTRMMHECYQIWAQL 
EHEAGTQLHRQTGLLLLGMKENQELKTIQANLSRQRVEHQCLSS 
EELKQRFPNIRLPRGEVGLLDNSGGVIYAYKALRALQDAIRQljG 
GIVRDGEKWEINPGLLVTVKTTSRSYQAKSLVITAGPWTNQLL 
RPLGIEMPLQTLRINVCYWREMVPGSYGVSQAFPCFLWLGLCPH 
HIYGLPTGEYPGLMKVSYHHGNHADPEERDCPTARTDIGDVQIL 
SSFVRDHLPDLKPEPAVIESCMYTNTPDEQFILDRHPKYDNIVI 
GAGFSGHGFKLAPWGKILYELSMKLTPSYDLAPFRISRFPSLG 
KA1IL 


6996 


543 


1942 


ETANAEAAARKSAMDWKEVLRRRLATPNTCPNKKKSEQELKDEE 
MDLFTKYYSEWKGGRKNTNEFYKTIPRFYYRLPAENEVLLQKLR 
EESRAVFLQRKSRELLDNEELQNLWFLLDKHQTPPMIGEEAMIN 
Y ENF LKVG E KAGAKC KQF FTA KVFAKLLHTDS YGR I S I MQ F FNY 
VMRKVWLHQTRIGLSLYDVAGQGYLRESDLENYILELIPTLPQL 
DGLEKSFYSFYVCTAVRKFFFFLDPLRTGKIKIQDILACSFLDD 
LLELRDEELSKESQETNWFSAPSALRVYGQYLNLDKDHNGMLSK 
EELSRYGTATMTNVFLDRVFQECLTYDGEMDYKTYLDFVLALEN 
RKEPAALQ Y I FKLLDI ENKG YLNVFS LNYFFRA I QELMKI HGQD 
PVS FQDVKDE I FDMVKPKDPLKISLQDLINSNQGDTVTTIL I DL 
NGFWTYENREALVANDSENSADLDDT 


6997 


370 


1104 


AMELTIFILRLAIYILTFPLYLLNFLGLWSWICKKWFPYFLVRF 
TV I YNE QMAS KKRELFSNLQ E FAG P S G KLSLLE VGCGTGAN FKF 
YPPGCRVTCIDPNPNFEKFLIKSIAENRHLQFERFWAAGENMH 
QVADGS VD VWCTLVLCS VKNQER I LREVCRVLRPGGAF Y FMEH 
VAAE C S TWNY F WQQ VLD PAWHLL FDG CNLTRE S W KALE RAS FS K 
LKLQHIQAPLSWELVRPHIYGYAVK 


6998 


2 


616 


fvsrallrvrsrrhpaeeraXpgrpedapiecpgatncpeplwc 
shlpvpyapptmesrgksasspkpdtkvpqvtteakvppaadgk 
apltkpskkeapaekqqppaapttapakktsakadpallnnhsn 
lkpaptvpsspdatpepkgpgdgaeedeaasggpggrgpwscen 
fnpllvaggvavaaialilgvaflvrkk 


6999 


14 


XO J X 


ettvslntvdsiesfvadinsghwdtvlqaiqslklpdktlidl 
yeqwlelielrelgaarsllrqtdpmimlkqtqperyihlenl 
larsyfdpreaypdgsskekrraaiaqalagevswppsrlmal 
lgqalkwqqhqgllppgmtidlfrgkaavkdveeekfptqlsrh 
ikfgqkshvecarfspdgqylvtgsvdgfievwnfttgkirkdl 
kyqaqdnfmmmddavlcmcfsrdtemlatgaqdgkikvwkiqsg 

nfT.TJP T?T?T?3MJCTfrtl7 r r<* , T.CTrcirr^CGOTT O 7\ C TrnfATi T r> tu/"' Y vrnv 

U^iJ^Kr c*Kiirloi\.oV lULiol? oRJJijoUliioAbr UU-i IKlHGliKSGK 

tlkefrghssfvneatftqdghyiisassdgtvkiwnmkttecs 
ntfkslgstagtditvnsvillpknpehfwcnrsntwimnmq 
gqivrsfssgkreggdfvccalsprgewiycvgedfvlycfstv 
tgklertltvhekdvigiahhphqnliatysedgllklwkp 


7000 


2 


827 


gpgwflelmesegppeserseffsqreeeneeeeaqepeetgp 
knpllqpaltgdveglqkifedpenphheqamqllleedivgrn 
llyaacmagqsdviralakygvnlnekttrgytllhcaaawgrl 
etlkalveldvdiealnfreerardvaarysqtecvefldwada 
rltlkkyiakvslavtdtekgsgkllkedkntilsacraknewl 

blHIEAS INELifr EQRQQIjEDIVTPIFTKMTTPCQVKSAKSVTSH 
DQKRSQDDTSN 


7001 


2056 


844 


RRCLIIAFLKGCFIFIYFIFIFETEFLSCCPGWSAVAQSRLIAN 
FASQVQAI FI LPKDSQVGPDVKSEAAPKRALYES VFGSGE I CGP 
TSPKRLCIRPSEPVDAVWVSVKHDPLPLLPEANGHRSTNSPTI 

v &rr\x voir i \JUOt\rIXl y lOI\rLi±. 1 KoJrrtoxrlJlNWViVaX f 1 fAuljl -K-olM 

APVHIDVGGHMYTSSLATLTKYPESRIGRLFDGTEP1VLDSLKQ 
HYFIDRDGQMFRYILNFLRTSKLLIPDDFKDYTLLYEEAKYFQL 
QPMLLEMERWKQDRETGRFSRPCECLWRVAPDLGERITLSGDK 
SLIEEVFPEIGDVMCNSVNAGWNHDSTHVIRFPLNGYCHLNSVQ 
VLERLQQRGFEIVGSCGGGVDSSQFSEYVLRRELRRTPRVPSVI 


7002 


1043 


498 


PMPSSTRWTTS*TYTDTSSAWACRPTTGTCT*TAAPGPTVRWWP 
TPCSRHQSRRRLTCWCSTSRPCGR*GGLCVRTAPTRPTTSASSS 
SWTSAGTSWPAGRRTGTATSGTATTTSVWPGCGTRMWSTQWSSV 
PRSRSCCSRPATTPPSKPGAPHAPCASSRHLAHGLAPSSPGLPA 
RGAEVC 


7003 


818 


61 


QGRFRAFCWQRDFLQPPGMRLSALUUASKVTLPPHYRYGMSPP 
GSVADKRKNPPWIRRRPVWEPISDEDWYLFCX3DTVEILEGKDA 

G KOG KWO VI P. OR M W VWRR T .NTH YR Y T R KTMD Y R d TM T T> Q TT a p 

LLHRQVKLVDPMDRKPTEIEWRFTEAGERVRVSTRSGRI I PKPE 
FPRADGIVPETWIDGPKDTSVEDALERTYVPCLKTLQEEVMEAM 
GI ketr\ntrrs IGIE PGAEQLLPNFCPSLEG 


7004 


121 


2285 


FLLPVLTSRSLRQPAVPHARLGGVEPAAMKSARAKTPRKPTVKK 
G\PKRTLKTQLG/YYCRVRPLGFPDQECCIEVINNTTVQLHTPE 
GYRLNRNGDYKETQYSFKQVFGTHTTQKELFDWANPLVNDLIH 
GKNGLLFTYGVTGSGKTHTMTGS PGEGGLLPRCLDMI FNS IGSF 
QAKR YVFKS NDRNSMD I QCE VDALLE RQKREAM PNPKTS S S KRQ 
VDPEFADMITVQEFCKAEEVDEDSVYGVFVSYIEIYNNYIYDLL 
EEVPFDPINPNLHNLNCFVKIKNHNMYVAGCTEVEVKSTEEAFE 
VFWRGQKKRRIANTHLNRESSRSHSVFNIKLVQAPLDADGDNVL 
QE KEQIT I SQLSLVDLAGSERTNRTRAEGNRLREAGNINQSLMT 
LRTCMDVLRENQM YGTNKMVP YRDS KLTHLFKNYFDGEGKVRM I 
VC VNPKAED YEENLQVMRFAEVTQE VEVARPVDKA I CGLTPGRR 
YRNQPRG P \ IGNE PLVTDWLQS FPPLPSCEI LDINDEQTLPRL 

GKLNEKEKMISGQKLEIERLEKKNKTLEYKIEILEKTTTIYEED 
KRNLQQELETQNQKLQRQFSDKRRLEARLQGMVTETTMKWEKEC 
ERRVAAKQLEMQNKLWVKDEKLKQLKAIVTEPKTEKPERPSRER 
DREKVTQRSVSPSPVPVSYL 


7005 


63 


876 


RNMALYQRWRCLRLQGLQACRLHTAWSTPPRWLAERLGLFEEL
WAAQVKRLASMAQKEPRTIKISLPGGQKIDAVAWNTTPYQLARQ 

FWHSSTH VLGAAAEQFLGAVLCRG P STE YGF YHDFFLG KERT I R 
GSELPVLERICQELTAAARPFRRLEASRDQLRQLFKDNPFKLHL 
IEEKVTGPTATVYGCGTLVDLCQGPHLRHTGQIGGLKLLSNSSS 
LWRSSG 


7006 


22 


898 


NAFGRHSTAVKMAAAAWLQVLPVILLLLGAHPSPLSFFSAGPAT 
VAAADRS KWH I P I PSG KNYFS FGKI LFRNTT I FLKFDGE PCDLS 
LNITWYLKSADCYNEIYNFKAEEVELYLEKLKEKRGLSGKYQTS 

DKTAMHEPLQTWQDAPYIFIVHIGISSSKESSKENSLSNLFTMT 
VEVKGPYEYLTLEDYPLMIFFMVMCIVYVLFGVLWIiAWSACYWR 
DLLRIQFWIGAVIFLGMLEKAVFYAGFQ 


7007 


2 


1001 


AMTVSGPGTPEPRPATPGASSVEQLRKEGNELFKCGDYGGALAA 
YTQALGLDATPQDQAVLHRNRAACHLKLED YDKAETEAS KAI EK 
DGGD VKAL Y RRSQALE KLGRLDQAVLDLQRCVS LE P KNKVFQEA 
LRNIGGQIQEKVRYMSSTDAKVEQMFQILLDPEEKGTEKKQKAS 
QN LWLAR E DAGAE K I FRSNG VQLLQRLLDMG E TDLMLAALRTL 
VG ICSEHQSRTVATLS I LGTRRWS ILGVESQAVSLAACHLLQV 
MFDALKEGVKKGFRGKEGAIIVGEWKQVWGLLDVTVMEGMGLSQ 
PCj^b rGDQTCSCRLFGIRFGDIILL 


7008 


70 


1478 


CRSALGHERPPPAHLPAGGRRLQTCPRSCRWLGRPPSGLPPGPR 
SPPPLAGPGQKMVQKKPAELQGFHRSFKGQNPFELAFSLDQPDH 
G DS DFGLQ C S AR PDMP AS QP I D I P DAKKRG KKKKRGRATDS FSG 
RFEDVYQLQEDVLGEGAHARVQTCINLITSQEYAVK1IEKQPGH 
1 Kb K V FK b V EML i Q CQGHRNVL EL I E F FE E E DR FYL VFE KMRGG 
SILSHIHKRRHFNELEASVWQDVASALDFLHNKGIAHRDLKPE 
NILCEHPNQVSPVKICDFDLGSGIKLNGDCSPISTPELLTPCGS 
AEYMAPEWEAFSEEASIYDKRCDLWSLGVILYILLSGYPPFVG 

KDLISKLLVRDAKQRLSAAQVLQHPWVQGCAPENTLPTPMVLQR 
WDSHFLLPPHPCRIHVRPGGLVRTVTVNB 


7009


1 


626 


ARQLRNSWVDDFVAAPLIPLSQQIPTGNSLYESYYKQVDPAYTG 
RVGASEAALFLKKSGLSDIILGKIWDLADPEGKGFLDKQGFYVA 
LRLVACZAQSGHEVTLSNIjNLSMPPPKFHDTSSPUWTPPSAEAH 
W AVR VE E KAKFDG I FE S LL P I NGLLS G DKVKP VLMNS KL P LD VL 
GRVWDLSDIDKDGHLVRDEFAVAMHLVYRALE 


7010 


79 


571 


SHTRRAWPETLLSPLCPLLGGGTAMSGGEQKPERYYVGVDVGT 
GSVRAALVDQSGVLLAFADQPIKNWEPQFNHHEQSSEDIWAACC 
WTKKWQG I DLNQ I RGLGFDATCS LWLDKQFHPL P VNQEGDS 
HRNVIMWLDHRAVSQVNRINETKHSVLQYVGG 


7011 


3 


994 


R I QTLPNQNQSQTQPLLKTPPAVLQPIAPQTTFGVQTQPQPQSL 
LQAQISAASITPLLQTQPQPLLQQPQQKAGLLQPPVRIVSQPQP 
ARRLDPPSRFSGRNDRGDQVPNRKDDRSRERERERRRSRERSPQ 
RKRSRERSPRRERERSPRRVRRWPRYTVQFSKFSLDCPSCDMM 
ELRRRYQNLYIPSDFFDAQFTWVDAFPLSRPFQLGNYCNFYVMH 
REVESLEKNMAILDPPDADHLYSAKVMLMASPSMEDLYHKSCAL
AEDPQELRDGFQHPARLVKFLVGMKGKDEAMAIGGHWSPSLDGP
DPEKDPSVLIKT\AIRCCKALTG


7012 


1 


2661 


RRAGSVKRGEARLFGPTERQSERPLRPSAARRPEMLSGKKAAAA
AAAAAAAATGTEAGPGTAGGSENGSEVAAQPAGLSGPAEVGPGA
VGERTPRKKEPPRASPPGGLAEPPGSAGPQAGPTWPGSATPME
TGIAETPEG\RRTSRRKRAKVEYREMDESLANLSEDEYYSEEER
NAKAEKEKKLPPPPPQAPPEEENESEPEEPSGVEGAAFQSRLPH
DRMTSQEAACFPDIISGPQQTQKVFLFIRNRTLQLWLDNPKIQL
TFEATLQQLEAPYNSDTVLVHRVHSYLERHGLINFGIYKRIKPL
PTKKTGKVIIIGSGVSGLAAARQLQSFGMDVTLLEARDRVGGRV
ATFRKGNYVADLGAMWTGLGGNPMAWSKQVNMELAKIKQKCP
LYEANGQAVPKEKDEMVEQEFNRLLEATSYLSHQLDFNVLNNKP
VSLGQALEWIQLQEKHVKDEQIEHWKKIVKTQEELKELLNKMV
NLKEKIKELHQQYKEASEVKPPRDITAEFLVKSKHRDLTALCKE
YDELAETQGKLEEKLQELEANPPSDVYLSSRDRQILDWHFANLE
FANATPLSTLSLKHWDQDDDFEFTGSHLTVRNGYSCVPVALAEG
LDIKLNTAVRQVRYTASGCEVIAVNTRSTSQTFIYKCDAVLCTL
PLGVLKQQPPAVQFVPPLPEWKTSAVQRMGFGNLNKWLCFDRV
FWDPSVNLFGHVGSTTASRGELFLFWNLYKAPILLALVAGEAAG
IMENISDDVIVGRCLAILKGIFGSSAVPQPKETWSRWRADPWA
RGSYSYVAAGSSGNDYDLMAQPITPGPSIPGAPQPIPRLFFAGE
HTIRNYPATVHGALLSGLREAGRIADQFLGAMYTLPRQATPGVP
AQQSPSM


7013 


1 


2661 


RRAGSVKRGEARLFGPTERQSERPLRPSAARRPEMLSGKKAAAA
AAAAAAAATGTEAGPGTAGGSENGSEVAAQPAGLSGPAEVGPGA
VGERTPRKKEPPRASPPGGLAEPPGSAGPQAGPTWPGSATPME
TGIAETPEG\RRTSRRKRAKVEYREMDESLANLSEDEYYSEEER
NAKAEKEKKLPPPPPQAPPEEENESEPEEPSGVEGAAFQSRLPH
DRMTSQEAACFPDIISGPQQTQKVFLFIRNRTLQLWLDNPKIQL
TFEATLQQLEAPYNSDTVLVHRVHSYLERHGLINFGIYKRIKPL
PTKKTGKVIIIGSGVSGLAAARQLQSFGMDVTLLEARDRVGGRV
ATFRKGNYVADLGAMWTGLGGNPMAWSKQVNMELAKIKQKCP
LYEANGQAVPKEKDEMVEQEFNRLLEATSYLSHQLDFNVLNNKP
VSLGQALEWIQLQEKHVKDEQIEHWKKIVKTQEELKELLNKMV
NLKEKIKELHQQYKEASEVKPPRDITAEFLVKSKHRDLTALCKE
YDELAETQGKLEEKLQELEANPPSDVYLSSRDRQILDWHFANLE
FANATPLSTLSLKHWDQDDDFEFTGSHLTVRNGYSCVPVALAEG
LDIKLNTAVRQVRYTASGCEVIAVNTRSTSQTFIYKCDAVLCTL
PLGVLKQQPPAVQFVPPLPEWKTSAVQRMGFGNLNKWLCFDRV
FWDPSVNLFGHVGSTTASRGELFLFWNLYKAPILLALVAGEAAG
IMENISDDVIVGRCLAILKGIFGSSAVPQPKETWSRWRADPWA
RGSYSYVAAGSSGNDYDLMAQPITPGPSIPGAPQPIPRLFFAGE
HTIRNYPATVHGALLSGLREAGRIADQFLGAMYTLPRQATPGVP
AQQSPSM


7014 


3 


3950 


DFEVGDKIRILATLEDGWLEGSLKGRTGIFPYRFVKLCPDTRVE 
ETMALPQEGSLARIPETSLDCLENTLGVEEQRHETSDHEAEEPD 
CIISEAPTSPLGHLTSEYDTDRNSYQDEDTAGGPPRSPGVEWEM 
PLATDSPTSDPTEWNGISSQPQVPFHPNLQKSQYYSTVGGSHP 
HSEQYPDLLPLEARTRDYASLPPKRMYSQLKTLQKPVLPLYRGS 
SVSASRWKPRQSSPQLHNLASYTKKHHTSSVYSISERLEMKPG 
PQAQGLVMEAATHSQGDGSTDLDSKLTQQLIEFEKSLAGPGTEP 
DKILRHFSIMDFNSEKDIVRGSSKLITEQELPERRKALRPPPPR 
PCTPVSTSPHLLVDQNLKPAPPLWRPSRPAPLPPSAQQRTNAV 
SPKLLSRHRPTCETLEKEGPGHMGRSLDQTSPCPLVLVRIEEME 
RDLDMYSRAQEELNLMLEEKQDESSRAETLEDLKFCESNIESLN
MELQQLREMTLLSSQSSSLVAPSGSVSAENPEQRMLEKRAKVIE 
ELLQTERDYIRDLEMCIERIMVPMQQAQVPNIDFEGLFGNMQMV 
I KVS KQLLAALE I S DAVGP VFLGHRDELEGT Y KI YCQNHDEAI A 
LLEIYEKDEKI QKH LQDS LADL KS L YNE WGCTNY INLGS FL I KP 
VQRVMRYPLLLMELLNSTPESHPDKVPLTNAVLAVKEINVNINE 
Y KRRKDLVLK YR KGDED S LME K I S KLN IHS 1 1 KKSNR VS SHLKH 
LTGFAPQIKDEVFEETEKNFRMQERLIKSFIRDLSLYLQHIRES 
ACVKWAAVSMWDVCMERGHRDLEOFERVHRYISDOIjPTKrPTfFP 

TERLVISPLNQLLSMFTGPHKLVQKRFDKLLDFYNCTERAEKLK 
DKKTLEELQSARNNYEALNAQLLDELPKFHQYAQGLFTNCVHGY 
AEAHCDFVHQALEQLKPLLSLLKVAGREGNLIAIFHEEHSRVLQ 
QLQVFTFFPESLPATKKPFERKTIDRQSARKPLLGLPSYMLQSE 
ELRAS LLARY P P E KLFQAERNFNAAQDLD VS LLEGDLVGV I KKK 
DPMGSQNRWLIDNGVTKGFVYSSFLKPYNPRRSHSDASVGSHSS 
TESEHGSSSPRFPRQNSGSTLTFNPN\S\MAVSFTSGSCQKQPQ 
DASPPPKEWDQGTLSASLNPSNSESSPSRCPSDPDSTSQPRSGD 
SADVARDVKQPTATPRSYRNFRHPEIVGYSVPGRNGQSQDLVKG 
CARTAQAPEDRSTEPDGSEAEGNQVYFAVYTFKARNPNELSVSA 
NQKLK I LEFKD VTGNTE W WLAE VNG KKGYVP SN Y I RKTE YT 


7015 


1842 


513 


RQAWHE\VAAPSWRGARLVQSVLRVWQVGPHVARERVIPFSSLL
GFQRR CVS C VAGS AFSG P R LAS AS R S NGQG S ALDHFLG FS Q PDS 
SVTPCVPAVSMITRDEQDVLLVHHPDMPENSRVLRWLLGAPNAG 
KSTLSNQLLGRKVFP VSRKVHTTRCQALGVI TE KETQV I LLDTP 
GI I S PGKQKRHHLELSLLEDPWKSMESADLVWLVDVSDKWTRN 
QLS P QLLR CLTK YS QI PS VL VMNKVDCL KQ KS VL LELTAAL TEG 
VVNGKKLKMRQAFHSHPGTHCPSPAVKDPNTQSVGNPQRIGWPH 
FKE I FMLS ALS QEDVKTLKQ YLLTQAQ PGPWE YHS AVLTS QTPE 
EI CAN I I REKLLEHLPQE VP YNVQQKTAVWEEG PGGEL VI QQ KL 

LVPKESYVKLLIGPKGHVISQIAQEAGHDLMDIFLCDVDIRLSV 
KLLK 


7016


167 


2513 


ILNAPKPPPPRDSVEAVAAKRDTGGGSWGTGMDVSGQETDWRST 
AFRQKLVS Q I EDAMR KAG VAH S KS S KDME S H VFL KAKTRD E YLS 
LVARL I IHFRD I HNKKSQAS VS DPMNALQS LTGG PAAGAAG IGM 
PPRGPGQSLGGMGSLGAMGQPMSLSGQPPPGTSGMAPHSMAWS 
TATPQTQLQLQQVAAAAAAATARSSSSSSRRRYSSSSSSSNSKQ 
FQAQQS AMQQ\Q FQA \ WQQQQQL\QQQQQQQQHL I KLHHQNQQ 
QIQQQQQQLQRIAQLQLQQQGOOOOOOOOOOOOALOAOPPTnnp 
PMQQPQPPPSQALPQQLQQMHHTQHHQPPPQPQQPPVAQNQPSQ 
LPPQSQTQPLVSQAQALPGQMLYTQPPLKFVRAPMWQQPPVQP 
Q VQQQQTAVQTAQAAQMVAPGVQVSQSS LPMLS S PS PGQQVQTP 
QSMPPPPQPSPQPGQPSSQPNSNVSSGPAPSPSSFLPSPSPQPF 
\ QS PVTARTPQNFS VPS PGPLNTPVNPSSVMSPAGSSQAEEQQY 
LDKLKQLSKYIEPLRRMINKIDKNEDRKKDLSKMKSLLDILTDP 
S KRCPLKTLQKCB I ALE KLKNDMAVPTPP P P P VP PTKQQYLCQP 
LLDAVLAN I RS PVFNHS LYRTF VPAMTAIHGPPI TAP WCTRKR 
RLEDDERQS I PSVLQGEVARLDPKFLVNLDPSHCSNNGTVHLI C 
KLDDKDLPSVPPLELSVPADYPAQSPLWIDRQWQYDANPFLQSV 
HRCMTSRLLQLPDKHSVTALLNTWAQSVHQACLSAA 


7017 


1 


1785 


INLGNTCYMNSVI*ALFMATDFRRQVLSLNLNGCNSLMKKLQH1, 
FAFLAHTQREAYAPRIFFEASRPPWFTPRSQQDCSEYLRFLLDR 
LHEEEKILKVQASHKPSEILECSETSLQEVASKAAVLTETPRTS 
DGEKTLIEKMFGGKLRTHlRCLNCRSTSQKAEAFTDLSLAFWPS 
YSLEYMSCPDCSQSPSIQDGGLMQASVPGPSEEPWYNPTTAAF 
I CDS L VNE KT I GS P PNE F YCS ENTS VPNESNK I L VNKD VPQKPG 
GETTPSVTDLLtfYFLAPEILTGDNQYYCENCASI^NAEKTMQIT 
EEPEYLILTLLRFSYDQKYHVRRKILDNVSLPLVLELPVKRITS 
FSSLSESWSVDVDFTDLSENLAKKLKPSGTDEASCTKLVPYLLS 
SWVHSGISSESGHYYSYARNITSTDSSYQMYHQSEALALASSQ 
SHLLGRDSPSAVFEQDLENKEMSKEWFLFNDSRVTFTSFQSVQK 
ITSRFPKDTAYVLLYKKQHSTNGLSGNNPTSGLWINGDPPLQKE 
LMDAITKDNKLYLQEQELNARARALQAASASCSFRPNGFDDNDP 
PGS CGPTGGGGGGGFNTVGRLVF 


7018


484 


1066 


SLVFRGNTWSGEAGHHCSALFNLAAYHQLFVGTERIRAPEIIFQ 
PSLIGEEQAGIAETLQYILDRYPKDVQEMLVQNVFLTGGNTMYP 
GM KARM E KELLEMR P FRS S FQVQLASN P VLDAW YGARD WALNHL 
DDNEVWITRKEYEEKGGEY^KEHCASNIYVPIRLPKQASRSSDA 
QAS S KGSAAGGGGAGEQA 


7019 


1048 


335 


APGGFLVTMVF PAPS PPWMLG CCS HEVTAGPPTLCKDMSALVAA 
RMRH I PLAPGS DWRDL PN I EVRLSDGTMARKLR YTHHDRKNGRS 
SSGALRGVCSCVEAGKACDPAARQFNTLIPWCLPHTGNRHNHWA 
GLYGRLEWDGFFSTTVTNPEPMGKQGRVLHPEQHRWSVRECAR 
SQGFPDTYRLFGNI LDKHRQVGNAVP P PLAKA I GLE I KLCMLAK 
ARESASAKI KEEEAAKD 


7020 


1 


2154 


FADSKRKSVLLDKIKNLQVALTSKQQSLETAMSFVARNTFKRVR 
NGFLMRKVAVFFSNTPTRASPQLREAVLKLSDAGITPbFLTRQE 
DRQLINALQ INNTAVGHALVLPAGRDLTDFLENVLTCHVCLDI C 
N I D P S CG FG S WRPS FRDRRAAG S DVD I DMAF I LDS AETTTLFQ F 
NEMKKYIAYIjVRQLDMSPDPKASQHFARVAWQHAPSESVDNAS 

M P PVIf Q T .Tn Vf3 C tfl? VT \mi?T ODnMTAT Ar>frm\T r*c*i\ t nvmT 

l'icc v xvvE»r j Li i uiuoJtviiJvijvjL/r udXbn lyjj^vj l rvAJUOoAIEYXl 
ENVFESAPNPRDLKIWLMLTGEVPEQQLEEAQRVILQAKCKGY 
FF WLG IGR K VNI KE VYTFAS E PND VF F KL VD KS TE LNEE PLMR 
FGRLLPSFVSSENAFYLSPDlRKOCDWFOGDOPTKNTiWPnHK'O 
VNVPNNVTSSPTSNPVTTTKPVTTTKPVTTTTKPVTTTTKPVTI 
INQ P S VKP AAAK PAPAKP VAAKP VATKTAT VR P P VAVKPATAAK 
PVAAKPAAVRPPAAAAAKPVATKPEVPRPQAAKPAATKPATTKP 
MVKMSREVQVFEITENSAKLHWERPEPPGPYFYDLTVTSAHDQS 
LVLKQNLTVTDRVIGGLLAGQTYHVAWCYLRSQVRATYHGSFS 
TKKSQPPPPQPARSASSSTINLMVSTEPLALTETDICKLPKDEG 
TCRDFILKWYYDPNTKSCARE'WYGGCGGNKNKPRcinKPrPKA/ra 
PVLAKPGVISVMGT i 


7021 


2 


338 


VNAVS FFPNGYAFATGSDDATCRLFDLRADQELLLYSHDNI ICG 
ITSVAFSKSGRLLLAGYDDFNCNVWDTLKGDRAGVLAGHDNRVS 
CLGVTDDGMAVATGSWDSFLRIWN 


7022 


2 


856 


VYIGSFWSHPLLIPDNRKLFEAEEQDLFRDIQSLPRNAALRKLN
GRIEREHQISPGDFPNLKRMQDQLQAQDFSKFQPLKSKLLEWD 
DMLAHDIAQLMVLVRQEESQRPIQMVKGGAFEGTLHGPFGHGYG 
EGAGEGIDDAEWWARDKPMxDEIFYTLSPVDGKITGANAKKEM 
VRSKLPNSVLGKIWKLADIDKDGMLDDDEFALANHLIKVKLEGH 
EL PNE L P AHLL P P S KRKVAE 


7023 


2 


748 


AMVFGGWPYVPQYRDIRRTQNADGFSTYVCLVLLVANILRILF 
WFGRRFESPLLWQSAIMILTMLLMLKLCTEVRVANELNARRRSF 
TAADSKDEEVKVAPRRSFLDFDPHHFWQWSSFSDYVQCVLAFTG 
VAGYI TYLS IDSAL FVETLGFLAVLTEAMLG VPQL YRNHRHQST 
EGMS I KMVLMWTSGDAFKTAYFLLKGAPL^FS VCGIjLQVIjVDLA 
ILGQAYAFARHPQKPAPHAVHPTGTKAL 


7024 


1207 


190 


RTGVTGWAQVWMFGGGGVLSSGEQLQMPVKPERGLGPSDGWLV 
S SRRGS PGTVLGLPFWLLTP VLVSRS I RSMLLLTRS PTAWHRLS 
QLKPPVLPGTLGGQALHLRSWLLSRQGPAETGGQGQPQGPGLRT 
RLLITGLFGAGLGGAWLALRAEKERLQQQKRTEALRQAAVGQGD 
FHLLDHRGRARCKADFRGQWVLMYFGFTHCPDICPDELEKLVQV 
VRQLEAEPGLPPVQPVFITVDPERDDVEAMARYVQDFHPRLLGL 
TGSTKQVAQASHSYRVYYNAGPKDEDQDYIVDHS IAIYLLNPDG 
LFTDYYGRSRSAEQISDSVRRHMAAFRSVLS 


7025 


232 


832 


ERNS P I GNNENL * K \ HSLDCLCFRGD WEGNTQFQTLQDNQEECF 
KQVIRTCEKRPTFNQHTVFNLHQRLNTGDKLNEFKELGKAFISG 
SDHTQHQLIHTSEKFCGDKECGNTFLPDSEVIQYQTVHTVKKTY 
ECKECGKSFSLRSSLTGHKRIHTGEKPFKCKDCGKAFRFHSQLS 
VHKRIHTGEKSYECKECGKAFSCG 


7026 


328 


1146 


NPNPS IGDI KDIKKAAKSMLDPAHKSHFHPVTPSLVFLCFI FDG 
LHQALLSVGVSKRSNTWGNENEERGTPYASRFKDMPNFIALEK 
SSVLRHCCDLLIGVAAGSSDKICTSSLQVQRRFKAMMASIGRLS 
HGESADLLISCNAESAIGWISSRPWVGELMFTFLFGDFESPLHK 
LRKS S * LPRKHR * QP INAVRMFLDQCMDGS IALRAIVS EIPVFE 

EKKNNG*KGIGEIF*WGCTLPPHYWGAVTTNVPKLSNSGKLLG 
QDEQPHIFG 


7027 


43 


954 


GRRLQQQQRPEDAEDGAEGGGKRGEAGWBGGYPEIVKENKLFEH 
YYQELKIVP EGE WGQ FMDALR E P LP ATLR I TG YKSHAKE I LHCL 
KNKYFECELEDLEMDGQKVEVPQPLSWYPEELAWHTNLSRKILRK 
SPHLEKFHQFLVSETESGNISRQEAVSMIPPLLLNVRPHHKILD 
MCAAPGSKTTQLIEMLHADMNVPFPEGFVIANDVDNKRCYLLVH 
QAKRLS S PC I M WNHDAS S I PRLQ I DVDGRKE I LF YDR I LCDVP 
CSGDGTMRKNIDVWKKWTTLNSLQLHGLQLRIATRGAEQL 


7028 


189 


608 


SRPPPEPEPGTMVEKGSDSSSEKGGVPGTPSTQSIiGSRNFIRNS 

KKMQSWYSMLSPTYKQRNEDFRKLFSKLPEAERLIVDYSCALQR 

EILLQGRLYLSENWICFYSNIFRWETTISIQLKEVTCLKKEKTA 
KLIPNAIQ 


7029 


1343 


40 


VLESNTEAKQATGTSSKLRHGTGQEKGREGPRCPSGLAQLRLWG 
/PC PHAGRETG PRAS AP I PGS * GHGWH W * R KDGRGERS EG PSAL 
SPHSPSLLNMQQAPTHVGPGMGSQRPRSSWPEQVGVGSQLSRE 
RWRA*RSLPGAAASERTEMTKERSP/RPCQGYDSSNWFTQPGKK 
TRKRNS RRNTMVS RGGGCLL YPLQS IMPE* QLR * GAHAS PPTQG 
R* G KGG P RS PLT KAS GTTH I PTPFFGS I P/RPTRDSGPGTDNS \ 
AAPGQKRGHREA* QGPEP V/ WGRVTTHLQGPAG * TKPLGS \ RNW 
VPGPAEGEQGEGAGLEGRP*PLKGCRSTLTFSPQLSIPMVGKKP 
PEGTTASFFP\RSCHSE*RKPPPSCPHAPALSLPHPLPLPLPPL 

PLPLPGAGT*HSARSGRPGQSETGSLCHNCHHCPPHCPKCSPGG 
T 


7030 


2 


521 


FVCFSAPGSGQGGKRRVNMELSAVGERVFAAEAliLKRRIRKGRM 
E YLVKWKGWSQKY S TWEP E EN I LDARLLAAFE E RE R EME L YG P K 
KRGPKPKTFLLKAQAKAKAKTYEFRSDSARGIRIPYPGRSPQDL 
ASTSRAREGLRN\RVCPRQRAAPAPAAP\PRRGPSGPGPRPG*G 
PG LH F PG PGGP S KHG FVPAS E QHQHQQHL PRRG PS G PGPRPG 


7031 


960 


59 


HCSVPGAEWPRKPPAQICPQLTSRPHLSSPRSLSPGCGHSPGPG 
/ CKPS / RHCDELHEGPSRTAALPCGKPQPKHGVEECG / PCPCLA 
PRRLTEPPALTVSPVGRAAPSGAL*PSGRACSACSHRLAPEAAL 
SAAAPRPSLGSGQNASGLPAASLPPQDSSQPHKTVP SPARS VP P 
LGAQARAAPPRLWCPRALVSG+EASPEAVSVAAGPPVPGPTPST 
SGSTASHSRRGC*SPR*TPAPPRRDHGRSAAFEVLTAAASAQPC 
ASQGGPRPTGAGRTPSPLGLPFSRGPPAASARPFCRHPSL 


7032 


1393 


2104 


RRPGRTE P VE PP PVP P PPRASNS KSRCR * RNLHLAPL * QS PLR K 
SRQIGTSSLPFGRSAGERPRPAATFCLSRGGSSPVFL+PSSSSL 
EPWMKRQFGRLHSLFWKSWQKMNSFLLTPKLDTSLMSGWRYRQR 
LPRLHTFLKKSLQMASELAPPLPTPAPLASSLPPPPGPPPLLPV 
PLA* LSRSG ILVPPNSGFSLS C\ PLGDH +GSSGEVRGSCGSPPP 
HHCWVLPPPP*LLLPPR 


7033 


. £89 


815 


RSRDCLSSSATSNRARRSKCSGPKRATPLDSGPGP*APPGPSSA 



WO 01/53312

PCT/US00/34263



LMMPSSCPWRTGALGPSPAGSRALGRCTSSVGPGSRWLTRTSSP 
QCATRTWRTMRMEPRPLRSRMGESAPGIPAELPSAAPSGPSAPS 

aaapsapttpaaagpntl*srrtaewcwppscsccwgwc*swsa 
wdwrrpplqvspapssscrasccwclesit*ssstarsratgas 
ssstcptsrsdrgaawtp\spmgapllpcsvplisreealqdpr 
npsp*gvcsgssghaglalgkppvacsvp 


7034 


92 


1942 


EDTSSMPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYRERV
KAMFYHAYDSYLENAFPFDELRPLTCDGHDTWGSFSLTLIDALD
TLL\TLFYFQILGNVSEFQRWEVLQDSVDFDIDVNASVFETNI

RVVGGLLSAHLL c ?KKAftVFVFRf^WDf , cr , DT r dm&up&advt t oa 

FQTPTGMPYGTVNLLHGVNPGETPVTCTAGIGTFIVEFATLSSL
TGDPVFEDVARVALMRLWESRSDIGLVGNHIDVLTGKWVAQDAG
IGAGVDSYFEYLVKGAILLQDKKLMAMFLEYNKAIRNYTRFDDW
YLWVQMYKGTVSMPVFQSLEAYWPGLQSLIGDIDNAMRTFLNYY
TVWKQFGGLPEFYNIPQGYTVEKREGYPLRPELIESAMYLYRAT
GDPTLLELGRDAVESIEKISKVECGPATIKDLRDHKLDNRMESF
FLAETVKYLYLLFDPTNFIHNNGSTFDAVITPYGECILGAGGYI
FNTEAHPIDPAALHCCQRLKEEQWEVEDLMREFYSLKRSRSKFQ
KNTVSSGPWEPPARPGTLFSPENHDQARERKPAKQKVPLLSCPS
QPFTSKLALLGQVFLDSS*PLDNFFIFIFLRLNYNKLLLAIIKK
K


7035 


92 


1942 


EDTSSMPFRLLIPLGLLCALLPQHHGAPGPDGSAPDPAHYRERV
KAMFYHAYDSYLENAFPFDELRPLTCDGHDTWGSFSLTLIDALD 
TLL\TLFYFQILGNVSEFQRWEVLQDSVDFDIDVNASVFETNI 
k v vt^ij ii AH Jj Lb KKAG VE VE AGW P CS G P LLRMAE E AARKLLPA 
FQTPTGMPYGTVNLLHGVNPGETPVTCTAGIGTFIVEFATLSSL 
TGDPVFEDVARVALMRLWESRSDIGLVGNHIDVLTGKWVAQDAG 
IGAGVDSYFEYLVKGAILLQDKKLMAMFLEYNKAIRNYTRFDDW
YLWVQMYKGTVSMPVFQSLEAYWPGLQSLIGDIDNAMRTFLNYY
TVWKQFGGLPEFYNIPQGYTVEKREGYPLRPELIESAMYLYRAT
GDPTLLELGRDAVESIEKISKVECGPATIKDLRDHKLDNRMESF
FLAETVKYLYLLFDPTNFIHNNGSTFDAVITPYGECILGAGGYI
FNTEAHPIDPAALHCCQRLKEEQWEVEDLMREFYSLKRSRSKFQ
KNTVSSGPWEPPARPGTLFSPENHDQARERKPAKQKVPLLSCPS
QPFTSKLALLGQVFLDSS*PLDNFFIFIFLRLNYNKLLLAIIKK
K 


7036 


442 


761 


CLAPLFSCFQIINLHLAPSGRLRWAWLRGPGRN*LPGEGPSIPT
RNW*ERKAGCSQPC/PAQQHHGRPPGVSPLPRDPHPTTLRPLPP 
PPPPPPPPPRRPPRNRRPG 


7037 


442 


761 


CLAPLFSCFQIINLHLAPSGRLRWAWLRGPGRN*LPGEGPSIPT
RNW*ERKAGCSQPC/PAQQHHGRPPGVSPLPRDPHPTTLRPLPP 
PPPPPPPPPRRPPRNRRPG 


7038 


155 


891 


GAGAASDMSSGLRAADFPRWKRHISEQLRRRDRLQRQAFEEIIL
QYNKLLEKSDLHSVLAQKLQAEKHDVPNRHEISPGHDGTWNDNQ
LQEMAQLRIKHQEELTELHKKRGELAQ\RVIDLNNQMQRKDREM
QMNEAKIAECLQTISDLETECLDLRTKLCDLERANQTLKDEYDA
LQITFTALEGKLRKTTEENQELVTRWMAEKAQEANRLNARE*KR
LQEAASPAAERACRSSKGTSTSRTG


7039 


155 


891 


GAGAASDMSSGLRAADFPRWKRHISEQLRRRDRLQRQAFEEIIL
QYNKLLEKSDLHSVLAQKLQAEKHDVPNRHEISPGHDGTWNDNQ
LQEMAQLRIKHQEELTELHKKRGELAQ\RVIDLNNQMQRKDREM
QMNEAKIAECLQTISDLETECLDLRTKLCDLERANQTLKDEYDA
LQITFTALEGKLRKTTEENQELVTRWMAEKAQEANRLNARE*KR
LQEAASPAAERACRSSKGTSTSRTG


7040 


34 


789 


KITPPRRPHRCSSGHGSDNSSVLSGELPPAMGKTALFYHSGGSS 
GYESVMRDSEATGSASSAQDSTSENSSSVGGRCRSLKTPKKRSN 
PGSQRRRLIPALSLDTSSPVRKPPNSTGVRWVDGPLRSSPRGLG 
EPFEIKVYEIDDVERLQRRRGGASKEAMCFNAKLKILEHRQQRI 
AEVRAKYEWLMKELEATKQYLMLDPNKWLSEFDLEQVWELDSLE 
YLEALECVTERLESRVNFCKAHLMMITCFDrT 


7041 


1 


567 


SGRVAMGRRRAPAGGSLGRALMRHQTQRSRSHRHTDSWLHTSEL 
NDGYDWGRLNLQSVTEQSSLDDFIATAELAGTEFVAEKLNIKFV 
PAEARTGLLSFEESQR I KKLHEENKQFLCI PRRPNWNQNTTPEE 
LKQAEKDNFLEWRRQL\VRLEEEQKLILTPFERNLDFWRQLWRV 
IERSDIWQIVDA 


7042 


7 


345 


P I HMAAAALRAD I \ I S P LFPH I QG YLLLSASHG \ ATS LHTKGAL 
PLETVTMYTVIPKSKYVLVKPDTQYPYSENLDEFKRLiAENSASN 
DDLLMAEVAISDYGDKLTLELREKY 


7043 


2 


2170 


ARGMAARDSDSEEDLVSYGTGLEPLEEGERPKKPIPLQDQTVRD 
E KGR Y KR FHG AF S GG FS AG Y FNT VGS KEG WTPS T F VS S RQNRAD 
KSVLGPEDFMDEEDLSEFGIAPKAIVTTDDFASKTKDRIREKAR 
QLAAATAP I PGATLLDDLITPAKLS VGFELLRKMGWKEGQGVGP 

RVKRRPRRQKPDPGVKIYGCALPPGSSEGSEGEDDDYLPDNVTF 
APKDVTPVDFTPKDNVWrJT.2WTtt*JT ncunaT T?r ,r Pcr , x?tTTrxTT unnn 

SERAGDLGEIGLNKGRKLGISGQAFGVGALEEEDDDIYATETLS 
KYDWLKDEEPGDGLYGWTAPRQYKNQKESEKDLRYVGKILDGF 
SLASKPLSSKKIYPPPELPRDYRPVHYFRPMVAATSENSHLLQV 
LSESAGKATPDPGTHSICHO'LNASK'RaFT.T.nPTDTnr'caTcvT t?i? 

LSQKDKERIKEMKQATDLKAAQLKARSLAQNAQSSRAQPSPAAA 
AGHCSWNMALGGGTATLKASNFKPFAKDPEKQKRYDEFLVHMKQ 
GQKDALERCLDPSMTEWERGRERDEFARAALLYASSHSTLSSRF 
THAKEEDDS DO VEVPRDOEND VGDKO SAVKMKM FGK LTP. dtpp w 

«e • ww »c fcJ *» J - y v wi/»\^i?ny <\j issa tc wZVU 1 t\.U i. V Ctrl 

HPDKLLFQ/RLVGLPRVKRDKYSVFNFLTLPETASLPTTQASSE 
KVSQHRGPDKSRKPSRWDTSKHEKKEDSISEFLRLARSKAEPPK 
QQSS PLVNKEEEHAPELSAN 


7044 


276 


734 


E VYLTDEFAKGRKVADL YELVOYAGNTI T prt.yt .T . T Tvr \nrv\7v c 
FPQSRKDILKJ3LVEMCRGVQHPLRGLFLRNYLLQCTRNILPDEG 
EPTDEETTGDISDSMDFVLLNFAEMNKLWVRMQHQGHSRDREKR 
ERERQELRILVGTNLVRLSQV 


7045 


3 


513 


LGFKMEALSRAGQEMSLAALKQHDPYITSIADLTGQVALYTFCP
KANQWEKTDIEGTLFVYRRSASPYHGFTIVNRLNMHNLVEPVNK
DLEFQLHEPFLLYRNASLSIYSIWFYDKNDCHRIAKLMADWEE
ETRRSQQA/RSGQTESQPGQWLQRPQAHRHPGDAEQSQG


7046 


3 


513 


LGFKMEALSRAGQEMSLAALKQHDPYITSIADLTGQVALYTFCP
KANQWEKTDIEGTLFVYRRSASPYHGFTIVNRLNMHNLVEPVNK
DLEFQLHEPFLLYRNASLSIYSIWFYDKNDCHRIAKLMADWEE
ETRRSQQA/RSGQTESQPGQWLQRPQAHRHPGDAEQSQG


7047 


103 


486 


QM K I E KCG W S EGLTS I KGN CHN FY TA I S KDVT Y KE LKNLLNS KN 
I ML I D VRE I WE I LE YO K I PE <3 1 NV P T «D WCZF T .dmmd p n tt vp irv 
NEVKPS KSDS / 1 VFSYLAGVRSKKALDTAISLGFHS YYER 


7048 


92 


ten 


FFCLTLLSSWDYRHHATRRVISSPVFTMEDSGKTFSSEEEEANY 
WKDLAMTYKQRAENTQEELREFQEGSREYEAELETQLQQIETRN 
RDLLS ENNRLRMELET I KEKFEVQHS EG YRQI S ALEDDLAQTKA 

IKDQLQKYIRELEQANDDLERAKRATDHGLSKTFE\QRLN\QAI 
EKKW 


7049 


393 


938 


KRTGSASYGGPPPGLGGPATXASVAGRCSSVGKIPARRCYEDEL
VPVFEAVGRIYELRLMMDFDGKNRGYAFVMYCHKHEAKRAVREL
NNYEIRPGRLLGVCCSVDNCRLFIGGIPKMKKREEILEEIAKVT
EGVLDVIVYASAADKMKNRGLRLRGVREPPRGCHWLGRKLIAWX
ASSLWG


7050 


393 


938 


KRTGSASYGGPPPGLGGPATXASVAGRCSSVGKIPARRCYEDEL
VPVFEAVGRIYELRLMMDFDGKNRGYAFVMYCHKHEAKRAVREL

NNYEIRPGRLLGVCCSVDNCRLFIGGIPKMKKREEILEEIAKVT 

EGVLDVIVYASAADKMKNRGLRLRGVREPPRGCHWLGRKLIAWX 
ASSLWG 


7051 


119 


816 


KKMNIiAEICDNAKKGREYALLGNYDSSMVYYQGVMQQIQRHCQS 
VRDPAIKGKWQQVRQELLEEYEQVKSIVGTLESFKIDKPPDFPV 
SCQDEPFRDPAVWPPPVPAEHRAPPQIRR/RQSRSKTSEERNGR 
SRSPGTCRPST\PISKSEKPSTSRDKDYRARGRDDKGRKNMQDG

ASDGEMPKFDGAGYDKDLVEALERDIVSRNPSIHWDDIADLEEA 
KKLLREAGVLPMWM 


7052 


467 


715 


SCPGRGKMSKLLNPEEMTSRDYYFDSYAHFGIHEEMLKDEVRTL 
TYRNSMYHNKHVFKDKWLDVGSGTGILSMFAARQGPRR 


7053 


467 


715 


SCPGRGKMSKLLNPEEMTSRDYYFDSYAHFGIHEEMLKDEVRTL
TYRNSMYHNKHVFKDKWLDVGSGTGILSMFAARQGPRR


7054 


1 


1036 


GTSQRSRETDARRRSAGAEPTARLPWPAALEEWPSCPCEPLGPG 
RRCRWDAMEYDEKLARFRQAHLNPFNKQSGPRQHEQGPGEEVPD
VTPEEALPELPPGEPEFRCPERVMDLGLSEDHFSRPVGLFLASD 
VQQLRQAIEECKQVILELPEQSEKQKDAWRLIHLRLKLQELKD 
PNEDEPNIRVLLEHRFYKEKSKSVKQTCDKCNTIIWGLIQTWYT 
CTGCYYRCHSKCLNLISKPCVSSKVSHQAEYELNICPETGLDSQ
DYRCAECRAPI/CS/DGWPSEARQCDYTGQYYCSHCHWWDLAV
IPARWHNWDFEPRKVSRCSMRYLALMVSRPVLRLREIN


7055 


2 


527 


DSRRVSWRSWLANE/WGKliLCLFIWLSMNVLLFWKTFLLYNQGP 

EYHYLHQMLG/ALCLSRASASVLNLNCSLILLPMCRTLLAYLRG 

SQKVPSRRTRRLLDKSRTFHITCGATICIFSGVHVAAHLVNALN 

FSVNYSEDFVELNAARYRDEDPRKLLFTTVPGLTGVCMEWLFL 
M 


7056


2 


527 


DSRRVSWRSWLANE/WGKHLCLFIWLSMNVLLFWKTFLLYNQGP

EYHYLHQMLG/ALCLSRASASVLNLNCSLILLPMCRTLLAYLRG 

SQKVPSRRTRRLLDKSRTFHITCGATICIFSGVHVAAHLVNALN 

FSVNYSEDFVELNAARYRDEDPRKLLFTTVPGLTGVCMEWLFL 

M 


7057 


1368 


431 



GIYLIIVNEKIPRPTCIGDRQENDKENLNLENHRDQELLHASCQA 
SGEVPSQASLRGFFTEDEPGCFGEGENLPEALQNIQDEGTGEQL 
SPQERISEKQLGQHLPNPHSGEMSTMWLEEKRETSQKGQPRAPM 
AQKLPTCRECGKTFYRNSQLIFHQRTHTGETYFQCTICKKAFLR 
SSDFVKHQRTHTGEKPCKCDYCGKGFSDFSGLRHHEKIHTGEKP 
YKCPICEKSFIQRSNFNRHQRVHTGEKPYKCSHCGKSFSWSSSL 
DKHQRSHLGKKPFQ+PVTKLSFPISISOPSHKNTQLHQEELCLR 
GYPC 


7058 


1 


469 


FS G FGAVPD AIX3 CRMS D LR I TE AFLYMD YLC FRALC C KG P P PAR 
PEYDLVCIGLTGSGKTSLLSKLCSESPDNWSTTGFSIKAVPFQ 
NAILNVKELGGADNIRKYWSRYYQGSQGVIFVLDSASSEDDLEA 
ARN*SCTQLLQHPQLCTLPFLILA 


7059 


1 


1178 


WPAFPRQPAAAAMDALLGTGPRRARGCriGAAGPTSSGRAARTPA " 

APWARFSAWLECVCWTFDLELGQALELVYPNDFRLTDKEKSSI 

CYLSFPDSHSGCLGDTQFSFRMRQCGGQRSPWHADDRHYNSRAP 

VALQREPAHYFGYVYFRQVKDSSVKRGYFQKSLVLVSRLPFVRL

FQAIjLSIiIAPEYFDKLAPCLEAVCSEIDQWPAPAPGQTLNLPVM 

GWVQVR I PSRVDKSES S PPKQFDQENLLPAP WtAS VHELDLF 

RCFRPVLTHMQTLWELMLLGEPLLVLiAPSPDVSSEMVLAl»TSCL 

QPLRFCCDFRPYFTIHDSEFKEFTTRTQAPPNWLGVTNPFFIK 

TLQHWPHILRVGEPKMSGDLPKQVKLKKPFKV*RPWDTKP 


7060


90 


1670 


SVNLPPSLWPWEEAMDSTKSEPLKGSPEAEDGNIEYKKLVNPSQ
YRFEHLVTQMKWRLQEGRGEAVYQIGVEDNGLLVGLAEEEMRAS
LKTLHRMAEKVGADITVLREREVDYDSDMPRKITEVLVRKVPDN
QQFLDLRVAVLGNVDSGKSTLLGVLTQGELDNGRGRARLNLFRH 
LHEIQSGRTSSISFEILGFNSKGEVHGINGTQWGQTLRMGW***
RT*DGGRVWRLFEIV*MNALRGL*TSSAPLRKSMGNQLN+IKNG 
VKIKRQGHPGNGLGPGNSEGVGRAGRRH*GPWALGQWNYSDSR
TAEEICESSSKMITFIDLAGHHKYLHTTIFGLTSYCPDCALLLV 
S ANTG I AGTTREHLGliAIALKVP F F I WS K I DLCAKTT VERTVR 
QLERVLKQPGCHKVPMLVTSEDDAVTAAQQFAQSPNVTPIFTLS 
SVSGESLDIiLKVFLNILPPLTNSKEQEELMQQLTEFQVDEIYTV 
PEVGTWGGTLSR* IDLLATLPTQPSPIYSKTSWPKGGDPGI 


7061 


364 


710 


ARMPS PLG PPCLPVMDPETTLEEPETARLPFRGFrYr>Pvar:pi?P 
ALARLRELCCQWLQPEAHSKEQMLEMLVLEQFLGTLPPEIQAWV 
RGQRPGSPEEAAALVEGLQHDP*ARMPSPLGPPCLPVMDPETTL
EEPETARLRFRGFCYQEVAGPREALARLRELCCQWLQPEAHSKE
QMLEML VLEO FLGT L P P E I OAW VRGQR PGS P E EA A AI» VEGLQHD 
PGQLLG 


7062 


71 


744 


AKAGTNLERLHWLSYFFCIPKHKLKSSQKDKVRQFMACTQAGER
TAIYCLTQNEWRLDEATDSFFQNPDSLHRESMRNAVDKKKLERL 
YGRYKDPQDENKIGVDGIQQFCDDLSLDPASISVLVIAWKFRAA 
TQCEFSRKEFLDGMTELGCDSMEKLKALLPRLEQELKDTAKFKD 
FYQFTFTFAKNPGOKGLDL*MAGAYWKTjVL^GRP'K'T5'T.YT MMT77T 
MEHH 


7063 


2 


562 


LRTVPDLPGRRFRAMRTGQRR*PELPPDMNSLEQAEDLKAFERR
LTEYIHCLQPATGRWRMLLIWSVCTATGAWNWLIDPETQKVSF
FTSLWNHPFFTISCITLIGLFFAGIHKRWAPSIIAARCRTVLA 
E YNMS CD DTG KL I LKPR PHVO * OS S L I VMGLKT A FT ,R T <? nT A w q 
HKGFLLRLDM 


7064 


300 


884 


RDTGSDPSSTRRLCSTCCTGH*PAEPIASPHPSRGTCPPASSAS 
SRRTGCWTCPPESGHAQARRSRRASASRWGARGAVRSAVAARGC
SSRAGRWLETPGRRRGPPACAAAAGRLRGPAP*AAPPTASVPAR 
CRCPAARTGAPAAATWLRRRLSGLRAPALGRRRSPGPSPKSAAP 
PLLTPLGAGRAGGSRANS 


7065 


1 


555 


ATTTHSARRSGRGAAAEAAASAAGGRQKGPDRKAWEGRRTTPGG 
RSQSEPKAPPPQKRSEAAFASMAHSPVAVQVPGMQNNIADPEEL 
FTKLERIGKGSFGEVFKGIDNRTQQWAIKIIDLEEAEDEIEDI 
QQEITVLSQCDSSYVTKYYGSYLKGSKLWIIMEYLGGGSALDLL
RAGPFDEFQ 


7066 


356 


676 


PGPQRGPWRAREGGHPLDPADHPRAPASLRSNVRAATMMQICDT
YNQKHSLFNAMNRFIGAVNNMDQTVMVPSLLRDVPLADPGLDND
VGVEVGGSGGCLEERTPP 


7067 


152 


973 


KENITMATEIGSPPRFFHMPRFnHOAPROT.FYTfPDnpannnaMn 
QLTFDGKRMRKAVNRKTIDYNPSVIKYLENRIWQRDQRDMRAIQ 
PDAGYYNDLVPPIGMLNNPMNAVTTKFVRTSTNKVKCPVFWRW
TPEGRRLVTGASSGEFTLWNGLTFNFETILQAHDSPVRAMTWSH
NDMWMLTADHGGYVKYWQSNMNNVKMFQAHKEAIREARFIHNIP
FSWPIVMVKLFSKCILGAEMHGLCQFLGNFLHPINTIFFFVFT 
HSPFCWAPF 


7068 


222 


816 


DTMKEYVLLLFLALCSAKPFFSPSHIALKNMMLKDMEDTDDDDD 
DDDDDDDDDDEDNSLFPTREPRSHFFPFDLFPMCPFGCQCYSRV 
VHCSDLGLTSVPTNIPFDTRMLDLQNNKIKEIKENDFKGLTSLY
GLILNNNKLTKIHPKAFLTTKKLRRLYLSHNQLSEIPLNLPKSL 
AELRIHENKVKKIQKDTFKKK 


7069


1147


1765 


FRDHRRYFYVNEQSGESQWEFPDGEEEEEESQAQENRDETLAKQ 
TLKDKTGTDSNSTESSETSTGSLCKESFSGQVSSSSLMPLTPFW 
TLLQSNVPVLQPPLPLEMPPPPPPPPESPPPPPPPPPAPKMPPP 
EKTKKGRKDKAKKSKTKMPSLVKKWQSIQRELDEEDNSSSSEED 
RVSTAQKRIEEWKQQQLVSGMAERNANFEA 


7070 


1 


547 


DGTMEDSEAVQRATALIEQRLAQEEENEKLRGDARQKLPMDLLV 

LEDEKHHGAQSAALQKVKGQERVRKTSLDLRREIIDVGGIQNLI 

ELRKXRKQKKRDALAASHEPPPEPEEITGPVDEETFLKAAVEGK 

MKVIEKFLADGGSADTCDQFRRTALHRASLEGHMEILEKLLDNG 
ATVDFQ 


7071 


2 


921 


ARGTLRALETAKKVGKVGANGQKAAGPSADSVTENKIGSPPKTP

VSNVAATSAGPSNVGTELNSVPQKSSPFLTRVPAYPPHSENIQY 

FQDPRTQIPFEVPQYPQTGYYPPPPTVPAGVAPCVPRFVRSNNV 

PESSLPPASMPYADHYSTFSPRDRMNSSPYQPPPPQPYGPVPPV 

PSGMYAPVYDSRRIWRPPMYQRDDIIRSNSLPPMDVMHSSVYQT 

SLRERYNSLDGYYSVACQPPSEPRTTVPLPREPCGHLKTSCEEQ 

IRRKPDQWAQYHTQKAPLVSSTLPVATQSPTPPSTLNRGEGS 


7072 


2 


921 


ARGTLRALETAKKVGKVGANGQKAAGPSADSVTENKIGSPPKTP
VSNVAATSAGPSNVGTELNSVPQKSSPFLTRVPAYPPHSENXQY
FQDPRTQIPFEVPQYPQTGYYPPPPTVPAGVAPCVPRFVRSNNV 
PESSLPPASMPYADHYSTFSPRDRMNSSPYQPPPPQPYGPVPPV 
PSGMYAPVYDSRRIWRPPMYQRDDIIRSNSLPPMDVMHSSVYQT 
SLRERYNSLDGYYSVACQPPSEPRTTVPLPREPCGHLKTSCEEQ 
IRRKPDQWAQYHTQKAPLVSSTLPVATQSPTPPSTLNRGEGS


7073 


50 


504 


LAHGSFGVSDFPAPAAAPAHTLTSFSGSLSPQFRKPLGRAPAMP 
LVRYRKWILGYRCVGKTSLAHQFVEGEFSEGYDPTVENTYSKI
VTLGKDEFHLHLVDTAGQDEYSILPYSFIIGVHGYVLVYSVTSL
HSFQVIESLYQKLHEGHGK 


7074 


263 


1003 


VCP VL CS TRQEPGHS S LVT YFG KPTRRKE FLLGHC f AAGKMN I S 
VDLETNYAELVLDVGRVTLGENSRKKMKDCKLRKKQNERVSRAM 
CALLNSGGGVIKAEIENEDYSYTKDGIGLDLENSFSNILLFVPE 
YLDFMQNGNYFLIFVKSWSLNTSGLRITTLSSNLYKRDITSAKV
MNATAALEFLKDMKKTRGRLYLRPELLAKRPRVDIQEENNMKAL 
AGVFFDRTELDRKEKLTFTESTHVEI 


7075 


598 


1005 


NYINFFFRKEYPPHVQKVEINPVRLSRLQGVERIMKKTEESESQ
VEPEIKRKVQQKRHCSTYQPTPPLSPASKKCLTHLEDLQRNCRQ

AITLNESTGPLLRTSIHQNSGGQKSQNTGLTTKKFYGNNVEKVP 
IDII 


7076


279 


1049 


LQSESSNAAEGNEQRHEDEQRSKRGGWSKGRKRKKPLRDSNAPK 
SPLTGYVRFMNERREQLRAKRPEVPFPEITRMLGNEWSKLPPEE 
KQRYLDEADRDKERYMKELEQYQKTEAYKVFSRKTQDRQKGKSH 
RQDAARQATHDHEKETEVKERSVFDIPIFTEEFLNHSKAREAEL
RQLRKSNMEFEERNAALQKHVESMRTAVEKLEVDVIQERSRNTV 
LQQHLETLRQVLTSSFASMPLPEXGETPTVDTIDSYM 


7077 


3 


1119 


SSMGSNSEINGLALRKTDKYGFLGGSQYSGSLKSSIPVDVARQR
ELKWLDMFSNWDKWLSRRFQKVKLRCRKGIPSSLRAKAWQYLSN 
SKELLEQNPRKFEELERAPGDPKWLDVIEKDLHRQFPFHEMFAA
RGGHGQQDLYRILKAYTIYRPDEGYCQAQAPVAAVLLMHMPAEQ 
AFWCLVQI CDKYLPGYYSAGLE AIQLDGE 1 FFALLRRAS PLAHR 
HLRRQRIDPVLYMTEWFMCIFARTLPWASVLRVWDMFFCEGVKI 
IFRVALVLLRHTLGSVEKLRSCQGMYETMEQLRNLPQQCMQEDF 
LVHEVTNLPVTEALIERENAAQLKKWRETRGELQYRPSRRLHGS
RAIHEERRRQQPPLGPSSS 


7078 


483 


767 


FQGQRMAGEQKPSSNLLEQFILLAKGTSGSALTALISQVLEAPG 
VYVFGELLELANVQELAEGANAAYLQLLNLFAYGTYPDYIANKE 
SLPELY


7079 


2 


376 


SWEFKRPKEPSGSDGESDGPIDVGQEGQLSQMARPLSTPSSSQ 
MQARKKRRGIIEKRRRDRINSSLSELRRLVPTAFEKQGSSKLEK
AEVLQMTVDHLKMLHATGGTGTHALLFQASFIQQIF


7080 


200 


595 


VQLPLEAPCLSLLSCRDHSGGNRDLSRRHRDCRVYGSPQDGIPY 
LTHPLCHQDWSVGRLQIRALATPGHTQGHLVYLLDGEPYKGPS 
CLFSGDLLFLSGCGEFPRKREELGEEGBTEVRAATVPWRALKP 


7081 


213 


soe 


AVTEEEMILNSLSLCYHNKLILAPMVRVGTLPMRLLALDYGADI
VYCEELIDLKMIQCKRWNEVLSTVDFVAPDDRWFRTCEREQN 
RWFQMGTS 


7082 


3 


1137 


APSRNTMLMAWCRGPVLLCLRQGLGTNSFLHGLGQEf>FEGAkSE~ 
CCRSSPRDLRDGEREHEAAQRKAPGAESCPSLPLSISDIGTGCL
SSLENLRLPTLREESSPRELEDSSGDQGRCGPTHQGSEDPSMLS 
QAQSATEVEERHVSPSCSTSRERPFQAGELILAETGEGETKFKK 
LFRLNNFGLLNSNWGAVPFGKIVGKFPGQILRSSFGKQYMLRRP 
ALEDYWLMKRGTAITFPKDINMILSMMDINPGDTVLEAGSGSG
GMSLFLSKAVGSQGRVISFEVRKDHHDLAKKNYKHWRDSWKLSH 
VEEWPDNVDFIHKDISGATEDIKSLTFDAVALDMLNPHVTLPVF
YPHLKHGGVCPVYWNITQVIELLD 


7083 


115 


541 


RSNAVQLTRMEYAMKSLSLLYPKSLSRHVSVRTSWTQQLLSEP
SPKAPRARPCRVSTADRSVRKGIMAYSLEDLLLKVRDTLMLADK 
PFFLVLEEDGTTVETEEYFQALAGDTVFMVLQKGQKWQPPSEQG
TRHPLSLSHK 


7084 


3 


522 


NSVSVSSQSRFLASVPGTGVQRSAAADMAASTAAGKQRIPKVAK
VKNKAPAEVQITAEQLLREAKEREIjELiLPPPPQQKITDEEELND 
YKLRKRKTFEDNTRKNPTVTQNWT wanWT7i?OT vpTnoiooTvui 

RALDVDYRNITLWLKYAEMEMKNRQVNHARNIWDRAITTL


7085 


243 


1499 


RQIiARLRRRGWRSPFGGAPMAHITINQYLQQVYEAIDSRDGASC"" 
AELVSFKHPHVANPRT.OMA^PPPKfonVT.'R'DPvnpMwaauT dpt 

YAVGNHDFIEAYKCQTVIVQSFLRAFQAHKEENWALPVMYAVAL
DLRVFANNADQQLVKKGKSKVGDMLEKAAELLMSCFRVCASDTR 
AGIEDSKKWGMLFLVNQLFKIYFKINKXiHLCKPLIRAIDSSNLK 
DDYSTAQRVTYKYYVGRKAMFDSDFKQAEEYLSFAFEHCHRSSQ 
KNKRMILIYLLPVKMLLGHMPTVELLKKYHLMQFAEVTRAVSEG
NLLLLHEALAKHEAFFIRCGIFLILEKLKIITYRNLFKKVYLLL
KTHQLSLDAFLVALKFMQVEDVDIDEVQCILANLIYMGHVKGYI
SHQHQKLWSKQNPFPPLSTGC 


7086 


256 


525 


ILAARMGKQNSKLRPEVMQDLLESTDFTEHEIQEWYKGFLRDCP
SGHLSMEEFKKIYGNFFPYGDASKFAEHVFRTFDANGDGTIDFR 
EF 


7087 


166 


723 


LSGSSAGKVAAPCVPPSNHELVPITTENAPKNVVDKGEGASRGG 
NTRKSLEDNGSTRVTPSVQPHLQPIRNMSVSRTMEDSCELDLVY
VTERIIAVSFPSTANEENFRSNLREVAQ'MLKSKHGGNYLLFNLS 
ERRPDITKLHAKVLEFGWPDLHTPALEKICSICKAMDTWLNAHP 
HRCRVLHNKG 


7088 


104 


759 


GTSAAS PSSLLEI4AGEITETGELYSS YVGLVYMFNLI VGTGALT " 
MPKAFATAGWLVSLVLLVFLGFMSFMTTTFVIEAMAAANAQLHW 
KRMENLKEEEDDDSSTASDSDVLIRDNYERAEKRPILSVQRRGS
PNPFEITDRVEMGQMASMFFNKVGVNLFYFCIIVYLYGDLAIYA 
AAVPFSLMQVTCSATGNDSCGVEADTKYNDTDRCWGPLRRVD


7089 


33 


1775 


SVCWEDRYLKARMEESPLSRAPSRGGVNFLNVARTYIPNTKVEC 
HYTLPPGTMPSASDWIGIFKVEAACVRDYHTFVWSSVPESTTDG
SPIHTSVQFQASYLPKPGAQLYQFRYVNRQGQVCGQSPPFQFRE 
PRPMDELVTLEEADGGSDILLWPKATVLQNQLDESQQERNDLM
QLKLQLEGQVTELRSRVQELERALATARQEOTELMEQYKGISRS 
HGEITEERDILSRQQGDHVARILELEDDIQTISEKVLTKEVELD
RLRDTVKALTREQEKLLGQLKEVQADKEQSEAELQVAQQENHHL 
NLDLKEAKSWQEEQSAQAQRLKDKVAQMKDTLGQAQQRVAELEP 
LKEQLRGAQELAASSQQKATLLGEELASAAAARDRTIAELHRSR
LEVAEVNGKLAELGLHLKEEKCQWSKERAGLLQSVEAEKDKILK 
LSAEILRLEKAVQEERTQNQVFKTELARBKDSSLVQLSESKRES 
TELRSALRVLQKEKEQLQEEKQELLEYMRKLEARLEKVADEKWN 
EDATTEDEEAAVGLSCPAALTDSEDESPEDMRLHPMAFVSVETQ 
ASLLLGLE 


7090 


33 


1775 


SVCWEDRYLKARMEESPLSRAPSRGGVNFLNVARTYIPNTKVEC
HYTLPPGTMPSASDWIGIFKVEAACVRDYHTFVWSSVPESTTDG 

PRPMDELVTLEEADGGSDILLWPKATVLQNQLDESQQERNDLM 
QLKLQLEGQVTELRSRVQELERALATARQEHTELMEQYKGISRS 
HGEITEERDILSRQQGDHVARILELEDDIQTISEKVLTKEVELD 
RLRDTVKALTREQEKLLGQLKEVQADKEQSEAELQVAQQENHHL 
NLDLKEAKSWQEEQSAQAQRLKDKVAQMKDTLGQAQQRVAELEP 
LKEQLRGAQELAASSQQKATLI/3EELASAAAARDRTIAELHRSR 

LSAEILRLEKAVQEERTQNQVFKTELAREKDSSLVQLSESKREL
TELRSALRVLQKEKEQLQEEKQELLEYMRKLEARLEKVADEKWN 
EDATTEDE E AAVGLS CP AAI/PH Q F DP Q D F HMD T .U DM a vm c ircrn 
ASLLLGLE 


7091 


186 


1076 


EGMLTREHRCGRQRFOFT.FPWDCipv'W'npQrtPtiJT bVMc«Trr> "~ 
EEPADSGQSLVPVYIYSPEYVSMCDSLAKIPKRASMVHSLIEAY 
ALHKQMRIVKPKVASMEEMATFHTDAYLQHLQKVSQEGDDDHPD 
SIEYGLGYDCPATEGIFDYAAAIGGATITAAQCLIDGMCKVAIN 
WSGGWHHAKKDEASGFCYLNDAVLGTlplrrK'FFR'TT vvnr nr u 
HGDGVEDAFSFTSKVMTVSLHKFSPGFFPGTGDVSDVGLGKGRY
YSVNVPIQDGIQDEKYYQICERYEPPAPNPGL


7092 


522 


809 


KQGINEDQEESQKPRLGEGCEPISKRQMKKLIKQKQWEEQRELR 
KQKRKEKRKRKKLEROCOMEPNSDGHDRKRVRRDWHqTT.Pr.TT 
DCSFDXLM 


7093 


454 


655 


NFGVSGVELAQQASMVRMSFVIAACQLVLGLLMTSLTESStQNS 
ECPQLCVCEIRPWFTPQSTYREA


7094 


2 


508 


FVRSMHWGVGFASSRPCWDLSWNQSISFFGWWAGSEEPFSFYG
D 1 1 AF P I^D YGG IMAGLGS DP WWKKTL YLTGGLALLAAAA YLLHE 
LLV I R KQQE IDS KDA 1 1 LHQ FAR PNNG VPS LS P FCLKME TYLRM 
ADLPYQNYFGGKLSAQGKMPWIEYNHEKVSGTEFII 


7095 


1 


411 


IASSLPKMASLLQSDRVLYLVQGEKKVRAPLSQLYFCRYCSELR
SLECVSHEVDSHYCPSCLENMPSAEAKLKKNRCANCFDCPGCMH
TLSTRATSISTQLPDDPAKTTMKKAYYLACGFCRWTSRDVGMAD
KSVGE 


7096 


224 


2067 


ETRSLAVQEKPSQAGRRRSSRISFAGALFLTRFLLQELLLNNFC 
SAMSPAPDAAPAPASISLFDLSADAPVFQGLSLVSHAPGEALAR 
t\ f K I o ^ £> uoLj Ei tit, £s f CiK J\1j1jIJ(j P MD I S S Kii F CS TCDQTFQNHQE 
QREHYKLDWHRFNLKQRLKDKPLLSALDFEKQSSTGDLSSISGS 
EDSDSASEEDLQTLDRERATFEKLSRPPGFYPHRVLFQNAQGQF 
LYAYRCVLGPHQDPPEEAELLLQNLQSXGPRDCWLMAAAGHFA
GAIFQGREWTHKTFHRYTVRAKRGTAQGLRDARGGPSHSAGAN
LRRYNEATLYKDVRDLLAGPSWAKALEEAGTILLRAPRSGRSLF 
FGGKGAPLQRGDPRLWDIPLATRRPTFQELQRVLHKLTTLHVYE 
EDPREAVRLHSPQTHWKTVREERKKPTEEEIRKICRDEKEALGQ 
NEESPKQGSGSEGEDGFQVELELVELTVGTLDLCESEVLPKRRR
RKRNKKEKSRDQEAGAHRTLLQQTQEEEPSTQSSQAVAAPLGPL 
LDEAKAPGQPELWNALLAACRAGDVGVLKLQLAPSPADPRVLSL 
LSAPLGSGGFTLLHAAAAAGRGSWRLLLEAGADPTVQCQDH 


7097 


256 


1228 


IRTKSAATWEAWPQCGREGSRIITEPCEANAGSRQELQTERISS 
FLAAQGDQAFHSGLETNNSNSELPLRVGLKVAQGSPLMGGQVSA 
SNSFSRLHCRNANEDWMSALCPRLWDVPLHHLSIPGSHDTMTYC
LNKKSPISHEESRLLQLLNKALPCITRPWLKWSVTQALDVTEQ
LDAG VR YLDLR I AHMLE GS EKNLHF VHM VYTTAL VE0TLTE I S E 
WLERHPREWILACRNFEGLSEDLHEYLVACIKNIFGDMLCPRG 
EVPTLRQLWSRGQQVIVSYEDESSLRRHHBLWPGVPYWWGNRVK 
TEALIRYLETMKSCGR 


7098 


82 


956 


SSFLKRCRKVLGCWGIPSEQSLFSTLEEPRDKEIDNYCVhJRLQT 
EARSGFWAPNRFPVNICRMTAVDGDRGG^RFTCPrwPUPQT pb 
LVLLLQDWQPGGVGICTSFLGISWALLDYHRALRTCLPSKPLLG
LGSSVIYFLWNLLLLWPRVLAVALFSALFPSYVALHFLGLWLVL 
LLWVWLQGTDFMPDPSSEWLYRVTVATILYFSWFNVAEGRTRGR 
AIIHFAFLLSDSILLVATWVTHSSWLPSGIPLQLWLPVGCGCFF 
LGLALRLVYYHWLHPSCCWKPDPDQVD 


7099 


992 


210 


LFRLAPGFLRSLARQGYHQIWAFPFLPSGATATWPAASRSRSLA 
ARSLPRSPARPGPNDALLGEHDFRGQGVRAQRFRFSEEPGPGAD
GAVLEVHVPOIGAGVSLPGILiAAKCGAR\/TT.«?nc;c:PT.DMr , T w\rr* 
RQS CQMNNLPHLQ WGLTWGH I S WDLLALPPQD 1 1 LASD VFFEP 
EDFEDILATIYFLMHKNPKVQLWSTYQVRSADWSLEALLYKWDM
KCVHIPLESFDADKEDIAESTLPGRHTVEMLVISFAKDSL


7100 


205 


671 


ANGGFWEAAPGSEVSLPLWVPTASHSKTTALGIGSAPPPHLSVL 
FLFSFPPQLGDPLEAFPVFKKYDRNGLNVSIECKRVSGLEPATV 
DWAFDLTKTNMQTMYEQSEWGWKDREKREEMTDDRAWYLIAWEN 
SSVPVAFSHFRFDVERGDEVLYW


7101 


2 


503 


WRGGPRRAKRLAGGAVGWVLLVRGVHSVRAGGGRPPRAADMKKD 
VRILLVGEPRVGKTSLIMSLVSEEFPEEVPPRAEEITIPADVTP 
ERVPTHIVDYSEAEQSDEQLHQEISQANVICIVYAVNNKHSIDK
VTSRWIPLINERTDKDSRLPLILGGNKSDLVBYSR 


7102 


2 


503 


WRGGPRRAKRLAGGAVGWVLLVRGVHSVRAGGGRPPRAADMKKD
VRILLVGEPRVGKTSLIMSLVSEEFPEEVPPRAEEITIPADVTP
ER V P TH I VD YS EAE OSD EO LHQE I S OANVI C I VYAVNNTttl <? T n K" 
VTSRWIPLINERTDKDSRLPLILGGNKSDLVEYSR


7103 


119 


438 


GSQSSVAVNIRSGTDEESMDLMNGQASSVNIAATASEKSSSSES 
LSDKGSELKKSFDAWFDVLKVTPEEYAGQITLMDVPVFKAIQP 
DELSSCGWNKKEKYSSAP


7104 


1670 


795 


RLWEHRSVSAGASGWGLSSPGCLLLHPSLPEEERVDILINNAGV 
MRCPHWTTEDGFEMQFGVNHLGEAWAGAAPWVQAILPRRPPKVL 
GF*V*VKSDLFIILNPGHFLLTNLLLDKLKASAPSRIINLSSLA 
HVAGHIDFDDLNWQTRKYNTKAAYCQS\KLAIVLFTKELSRRLQ
GSGVTVNALHPGVARTELGRHTGIHGSTFLQHHN\WAHLLAAWS 
KSPRSWPAPAQHNTLAVAEELA\VISGKYFDGLKQKAPAPEAED 
EEVARRLWAESARLVGLEAPSVREQPLPR 


7105 


765 


143 


GQMCRRPSPKSTSCLSMTCDLP/RGLQDPQCLALFRVAVDKHQA 

TiT.K'A2iMQnnrJVmPWT.t7aT.VT'\/'CDI7T UT ocdet TAiraeoniiiAT am 
ijjjxvru-u.'iouu^' v lyivriij r Mij I x voKr Junuyolrr biU Vn.i>&UWyjjo J. 

SQIPVQQMHLFDVHNYPDYVSSGGGFGPADDHGYGVSYIFMGDG 

MITFHISSKKSSTKTDSHRLGQHIEDALLDVASLFQAGQHFKRR

FRGSGKENSRHRCGFLSRQTGASKASMTSTDF 


7106 


14 


1064 


GLQAGHPHPRSASRIPEADTH\YSKLQRAFDSIVNKDHKRMFGT
YFRVGFFGSKFGDLDEQEFVYKEPAITKLPEISHRLEAFYGQCF
GAEFVEVIKDSTPVDKTKLDPNKAYIQITFVEPYFDEYEMKDRV 
TYFEKNFNLRRFMYTTPFTLEGRPRGELHEQYRRNTVLTTMHAF
PYIKTRISVIQKEEFVLTPIEVAIEDMKKKTLQLAVAINQEPPD 
AKMLQMVLQGSVGATVNQGPLEVAQVFLAEIPADPKLYRHHNKL
RLCFKEFIMRCGEAVEKNKRLITADQREYQQELKKNYNKLKBNL 
RPMIERKIPELYKPIFRVESQKRDSFHRSSFRKCETQLSQGS 


7107 


1145 


591 


*I*WLQTGKKK 


7108 


1 


942 


VK VALLLTNLE Q PRTB S E WENS FTLKM FLFQFVNLNSS T FY I AF 
FLGRFTGHPGAYLRLINRWRLEECHPSGCLIDLCMQMGI IMVLK 
QTWNNFMELGYPLIQNWWTRRKVRQEHGPERKISFPQWEKDYNL
QPMNAYGLFDEYLEMILQFGFTTIFVAAFPLAPLLALLNNIIEI
RLDAYKFVTQWRRPLASRAKDIGIWYGILEGIGILSVITNAFVI 
AITSDFIPRLVYAYKYGPCAGQGEAGQKOJVGYVNASLSVFRIS 
DFENRSEPESDGSEFSGTPLKYCRYRDYRDPPHSLVPYGYTLQF 
WHVLAW 


7109 


964 


102 


WDQRKRNSLVPGPAHGPAQEEPWEKKESLGAAQEALSIQLQPKE
TQPFPKSEQVYLHFLSWTEEX3PEPKDKGSLPQPPITEVESQVF 
SEKLATDTSTFEATSEGTLELQQRNPKAERLRWSPAQEESFRQM
WIHKEIPTGKKDHECSECGKTFIYNSHLWHQRVHSGEKPYKC 
SDCGKTFKQSSNLGQHQRIHTGEKPFECNECGKAFRWGAHLVQH
QRIHSGEKPYECNECGKAFSQSSYLSQHRRIHSGEKPFICKECG 
KAYGWCSELIRHRRVHARKEPSH


7110 


96 


697 


RLDNFSGFLVEVTKEERHIVKPLYDRYRLVKQMLTRASITPVLG 
SPSTKRRGQMLQPIIEGETAHFFEEIKEEEEDGVNLSSELGDML 
KTAVQVQSSLKNSESDVEENQEKLALDLRLSSSRAASMPELLEQ 
LWKARAEKKKLRKTLREFEEAFYQQNGRNAQKEDRVPVLEEYRE 
YKKIKAKLRLLEVLISKQDSSKSI 


7111 


2 


414 


GSGLYRGPTPGGQCIWKPNSMPPDHERNFGFTQFALELNELTAE 
LKRSLPSTDTRLRPDQRYLEEGNIQAAEAQKRRIEQLQRDRRKV 
MEENNIVHQARFFRRQTDSSGKEWWVTNNTYWRLRAEPGYGNMD
GAVLW 


7112 


103 


495 


PRCFPVADRGRLIGGLPDWTIMEGKTLNLTCTVFGNPDPEVIW 
FKNDQDIQLSEHFSVKVEQAKYVSMTIKGVTSEDSGKYSINIKN
KYGGEKIDVTVSVYKHGEKIPDMAPPQQAKPKLIPASASAAGQ


7113 


1 


824 


KCLRQAWHEAPSSLAFTRWCSREERAEGGGNLHRSITRDPKPPG 
LRPSQRPMDDKKKKRSPKPCLAQPAQAPGTLRRVPVPTSHSGSL 
ALGLPHLPSPKQRAKFKRVGKEKGRPVLAGGGSGSAGTPLQHSF
LTEVTDVYEMEGGLLNLLNDFHSGRLQAFGKECSFEQLEHVREM 
QEKLARLHFSLDVCGEEEDDEEEEDGVTEGLPEEQKKTMADRNL 
DQLLSNLGSCLGALVPGGMRGGEGTYSQSHSWALGEKVGVHGSK 
SSGPLNLPRJR 


7114 


3 


1492 


VWEVDEOI DHYKF SODKFTjWOAAFTfitf FTT.KDT? ^nnwnvr'rx} vr 
IYLNTDFVSVKQRLPKYYSWERCSKHHLNFLGQNRSYVRKKDDG 
CKAYWKVCLHYNLHKAQPAERFFDPNQRGKALHQKQALRKSQRS 
QTGEKLYKCTECGKVFIQKANLWHQRTHTGEKPYECCECAKAF 
SQKSTLIAHQRTHTGEKPYECSECGKTFIQKSTLIKHQRTHTGE 
KPFVCDKCPKAFKSSYHLIRHEKTHIRQAFYKGIKCTTSSLIYQ
RIHTSEKPQCSEHGKASDEKPSPTKHWRTHTKENIYECSKCGKS 
FRGKSHLSVHQRIHTGEKPYECSICGKTFSGKSHLSVHHRTHTG 
EKP YECRRCGKAFGEKSTL I VHORMHTGEKP YKCNE CGKA F <? F K 
SPLIKHQRIHTGERPYECTDCKKAFSRKSTIilKHQRIHTGEKPY 
KCSECGKAFSVKSTLIVHHRTHTGEKPYECRDCGKAFSGKSTLI 
KHQRSHTGDKNL 


7115 


1 


947 


NAAHGYNWGLWCMYIIPPQDWLDRGDESAPIRTPAMIGCSFWD 
REYFGDIGLLDPGMEVYGGENVKLGMRVWQCGGSMEVLPCSRVA 
HIERTRKPYNNDIDYYAKRNALRAAEVWMDDFKSHVYMAWNIPM
SNPGVDFGDVSERLALRQRLKCRSFKWYLENVYPEMRVYNNTLT 
YGEVRNSKASAYCLDQGT^EDGDRAHjYPCHGMSSQLVRYSADGL 
LQLGPLGSTAFLPDSKCLVDDGTGRMPTLKXCEDVARPTQRLWD 
FTQSGPIVSRATGRCLEVEMSKDANFGLRLWQRCSGQKWMIRN
WIKHARH 


7116 


866 


95 


RVRMRRNAEVIEEKLSMKSWAKFRPGEPWKGYPNIDPETDPYVT 
PGSVINNLSINTVREVDHLRDRNSGSSSSLNTTLPSTSAWSSIR
ASNYNVPLSSTAQSTSARNSDSKLTWSPGSVTNTSLAHELWKVP
LPPKNITAPSRPPPGLTGQKPPLSTWDNSPLRIGGGWGNSDARY 
TPGSSWGESSSGRITNWLVLKNLTPQIDGSTLRTLCMQHGPLIT 
FHLNLPHGNALVRYSSKEEWKAQKSLHISDLFLLTL


7117 


695 


1261 


LLISTPGGCHPPPSSIEFTYTGAWGKALPAPHMPCAPGALPQGA 
FVSQAARAIPLLQPSQAAQAEGLSQPARACGALCSLPWPLRNWG 
SPILRLPGGLRTPTNDRKTRTRSAMACWARAQWDTLGPLKLSHR
GKVCLRHPRPTGVRGGPGAAGRQGGMGTRRRGTFTSGARDPGGL 
RVKHRCQPTGHLP


7118 


49 


1863 


PHCEPNPGAGAMVLLHVLFEHAVGYALLALKEVEEISLLQPQVE 
ESVLNLGKFHSIVRLVAFCPFASSQVALENANAVSEGWHEDLR 
LLLETHLPSKKKKVLLGVGDPKIGAAIQEBLGYNCQTGGVIAEI 
LRGVRLHFHNLVKGLTDLSACKAQLGLGHSYSRAKVKFNVNRVD 
NMIIQSISLLDQLDKDINTFSMRVREWYGYHFPELVKIINDNAT 
YCRLAQFIGNRRELNEDKLEKLEELTMDGAKAKAILDASRSSMG 
MDISAIDLINIESFSSRWSLSEYRQSLHTYLRSKMSQVAPSLS 
ALIGEAVGARLIAHAGSLTNLAKYPASTVQILGAEKALFRALKT 
RGNTPKYGLIFHSTFIGRAAAKNKGRISRYLANKCSIASRIDCF
SEVPTSVFGEKLREQVEERLSFYETGEIPRKNLDVMKEAMVQAE 
EAAAEITRKLEKQEKKRLKKEKKRLAALALASSENSSSTPEECE 
EMSEKPKKKKKQKPQEVPQENGMEDPSISFSKPKKKKSFSKEEL 
MSSDLEETAGSTSIPKRKKSTPKEETVNDPEEAGHRSGSKKKRK 
FSKEEPVSSGPEEAAGKSSSKKKKKFHKASQED 


7119 


49 


1863 


PHCEPNPGAGAMVLLHVLFEHAVGYALLALKEVEEISLLQPQVE 
ESVLNLGKFHSIVRLVAFCPFASSQVALENANAVSEGWHEDLR
LLL ETHL PS KKKKVLLG VGDP K I GAA I OEF T ,R YNmTRf? V T a T? T 
LRGVRLHFHNLVKGLTDLSACKAQLGLGHSYSRAKVKFNVNRVD 
NMIIQSISLLDQLDKDINTFSMRVREWYGYHFPELVKIINDNAT
YCRLAQFIGNRRELNEDKLEKLEELTMDGAKAKAILDASRSSMG
MDISAIDLINIESFSSRWSLSEYRQSLHTYLRSKMSQVAPSLS 
ALIGEAVGARLIAHAGSLTNLAKYPASTVQILGAEKALFRALKT
RGNTPKYGLIFHSTFIGRAAAKNKGRISRYLANKCSIASRIDCF 
SEVPTSVFGEKliREQVEERLSFYETGEIPRKNLDVMKEAMVQAE 
EAAAEITRKLEKQEKKRLKKEKKRLAALALASSENSSSTPEECE 
EMSEKPKKKKKQKPQEVPQENGMEDPSISFSKPKKKKSFSKEEL 
MSSDLEETAGSTSIPKRKKSTPKEETVNDPEEAGHRSGSKKKRK 
FSKEEPVSSGPEEAAGKSSSKKKKKFHKASQED


7120 


1991 


64 


QLGTRRCLRGDKVTNAMQDFLVTNLEPRFIEPQTANLSWFKDS 
NSTTPLIFVLSPGTDPAADLYKFAEEMKFSKKLSAISLGQGQGP 
RAEAMMRSSIERGKWVFFQNCHLAPSWMPALERLIEHINPDKVH
RDFRLWLTSLPSNKFPVSILQNGSKMTIEPPRGVRANLLKSYSS
LGEDFLNSCHKVMEFKSLLLSLCLFHGNALERRKFGPLGFNIPY 
EFTDGDLRICISQLKMFLDEYDDIPYKVLKYTAGEINYGGRVTD 
DWDRRCIMNILEDFYNPDVLSPEHSYSASGIYHQIPPTYDLHGY
L.SYIKSLPLNDMPEIFGLHDNANITFAQNETFALLGTIIQLQPK 
S SSAGS QGREE I VEDVTQNI LLKVPEPINLQWVMAKYP VIjYEES 
MNTVLVQEVIRYNRLLQVITQTLQDLLKALKGLWMSSQLELMA
ASLYNNTVPELWSAKAYPSLKPLSSWVMDLLQRLDFLQAWIQDG 
IPAVFWISGFFFPQAFLTGTLQNFARKFVISIDTISFDFKVMFE 
APSELTQRPQVGCYIHGLFLEGARWDPEAFQLAESQPKELYTEM 
AVIWLLPTPNRKAQDQDFYLCPIYKTLTRAGTLSTTGHSTNYVI 
AVEIPTHQPQRHWIKRGVALICALDY


7121 


2 


546


RPLRPWVLSLGSMVGLMTYGRRQFQSLDTTMRRLIPPFREASAK
LTTLVDADAEAFTAYLEAMRLPKNTPEEKDRRTAALQEGLRRAV
SVPLTLAETVASLWPALQELARCGNLACRSDLQVAAKALEMGVF 
GAYFNVLINLRDITDEAFKDQIHHRVSSLLQEAKTQAALVLDCL
ETRQE 


7122 


2 


546 


RPLRPWVLSLGSMVGLMTYGRRQFQSLDTTMRRLIPPFREASAK
LTTLVDADAEAFTAYLEAMRXPKNTPEEKDRRTAALQEGLRRAV 

GAYFNVLINLRDITDEAFKDQIHHRVSSLLQEAKTQAALVLDCL
ETRQE 


7123 


1 


1092 


kPAVPEARSAGTSEAGR^GAERU^rri^vcirinrrnnMDT tdd&t ro ~ 

AAQAAWRENFPLCGRDVARWFPGHMAKGLKKMQSSLKLVDCIIE 
VHDARIPLSGRNPLFQETLGLKPHLLVLNKMDLADLTBQQKIMQ 
HLEGEGLKNVIFTNCVKDENVKQIIPMVTELIGRSHRYHRKENL
EYCIMVIGVPNVGKSSLINSLRRQHLRKGKATRVGGEPGITRAV 
MSKIQVSERPLMFLLDTPGVIAPRIESVETGLKLALCGTVLDHL 
VGEETMADYLLYTLNKHQRFGYVQHYGLGSACDNVERVLKSVAV 
KLGKTQKVKVLTGTGNVNVIQPNYPAAARDFLQTFRRGLLGSVM
LDLDVLRGHPRV


7124 


2 


382 


LPLTLLLAAPFAHLLLPPGHDQSPCWHPGPALSPGTLGPLSWAM 
ANSGLQLLGYFLALGGWVGIIASTALPQWKQSSYAGDASIQLRS 
KVFVLESEWGGDSLGLPRDCGWSCLLHSAVRSEKGFWS 


7125 


166 


1127 


NCISEKRNYSFSMQKGKGRTSRIRRRKLCGSSESRGVNESHKSE 
FIELRKWLKARKFQDSNLAPACFPGTGRGLMSQTSLQEGQMIIS
i-trao\^uui. \t\U i viKoIiAjHi 1 1 J\WKPfc'PbrIjIjAL>CTFLVSEKH 
AGHRSLLEA\YLEILPKAYTCPVCLEPSWNLLPKSLKAKAEEQ 
RAHVQEFFASSRDFFSSLQPLFAEAVDSIFSYSALLWAWCTVNT 
RAVYL\SPGSGNAFLQSRTPVQLAPYLDLLNHSPHVQVKAAFNE
ETHSYEIRTTSRWRKHEEVFICYGPHDNQRLFLEYGFVSVHNPH
ACVYVSRGWNQLCS 


7126 


1 


733 


CRDMAAFIVPSPARRCSQKGSLGHLPTQPWLWAAMSPRGQERGT
SHSQAREPQRPGRWLLGSLQSSPGTLGQAGTASRRRGCMVQRWV 

WVttl VjKtCH V \J V ir A-VjAlXjiaAJjOE. 1 iilrVjAJb K.CjMoGGAGGCWALGWA 

PSPVLPSWLLEGPPPWLSIISDSGTQRPSPRRCPARPSPWGPQC 
WRGGRIASAEASST*TPGSGSRARSGRRSPGSRRRSASAPSPTP 
PTDACA*SCVARPAGSRSSRPAAA


7127 


1311 


277 


GLPAMCST*KAGYYEETEGDCIPKDR*IEKRPFKEI*RRIPRIF
AKQKQI*S*NSQKIGASEIDRGRKEADCSDAPAAARIGAVSVFR
RSTQEARVSPRSNAKSANLRAVRAD*WEHFVLLFHTPEQFLAEC
ICRST**K*WHQLC*PLSSL*TGLKRKLLL*VLFRI*WLKDCDV
*FCQKIFATNFCNWQNLIQ*EE*KPVEYSVEN*HIMNLLLPM*L
CQSSLRDQTIVTWRM*RNYSMFRINMISSL*DGSIHIPLKLHFY 
PALIFTLTVPINSCCQRPLPLFAHQSIKTLASSGSPMIACLRFL 
LVKKRAFIHTPRSPGCSV*CKHVLVKDNKNNCVGSEV 


7128 


2 


5228 


GRVDLWTILLGRSALRELSQIEAELNKHWRRLLEGLSYYKPPSP
SSAEKVKANKDVASPLKELGLRISKFLGLDEEQSVQLIjQCYLQE 
DYRGTRDSVKTVLQDERQSQALILKIADYYYEERTCILRCVLHL
LTYFQDERHPYRVEYADCVDKLEKELVSKYRQQFEELYKTEAPT 
WETHGNLMTERQVSRWFVQCLREQSMLLEIIFLYYAYFEMAPSD 
LLVLTKMFKEQGFGSRQTNRHLVDETMDPFVDRIGYFSALILVE 
GMDIESLHKCALDDRRELHQFAQIX3LICQDMDCLMLTFGDIPHH 
APVLLAWALLRHTLNPEETSSWRKIGGTAIQLNVFQYLTRLLQ 
SLASGGNDCTTSTACMCVYGLLSFVLTSLELHTLGNQQDIIDTA
CEVLADPSLPELFWGTEPTSGLGIILDSVCGMFPHLLSPLLQLL 
RALVSGKSIAKKVYSFLDKMSFYNELYKHKPHDVTSHEDGTLWR
RQTPKLLYPLGGQTNLRIPQGTVGQVMLDDRAYLVRWEYSYSSW
TL FTC E I E M LLHWS TAD V I QHCQR VKP 1 1 DL VHKV I S TDLS I A 
DCLLPITSRIYMLLQRLTTVISPPVDVIASCVNCLTVLAARNPA
KVWTDLRHTGFLPFVAHPVSSLSQMISAEGMNAGGYGNLLMNSE
QPQGEYGVTIAFLRLITTLVKGQLGSTQSQGLVPCVMFVLKEML 
PSYHKWRYNSHGVREQIGCLILELIHAILNLCHETDLHSSHTPS 
LQFLCICSLAYTEAGQTVINIMGIGVDTIDMVMAAQPRSDGAEG 
QGQGQLLIKTVKLAFSVTNNVIRLKPPSNWSPLEQALSQHGAH
GNNLIAVLAKYIYHKHDPALPRLA1QLLKRLATVAPMSVYACLG 
NDAAAIRDAFLTRLQSK\IE\DMRIK\VMIL\EFLTVA\VETQP 
GLIELFLNLEVKDG\SDGSKEFSLGMW\SCLHAV/VWEL1DSQQ 
QDRYWCPPLLHRAAIAFLHALWQDRRDSAMLVLRTKPKFWENLT 
SPLFGTLSPPSETSEPSILETCALIMKIICLEIYYWKGSLDQP
LKDTLKKFSIEKRFAYWSGYVKSLAVHVAETEGSSCTSLLEYQM 
LVSAWRMLLIIATTHADIMHLTDSVVRRQLFLDVLDGTKALLLV 
PASVNCLRLGSMKCTLLLILLRQWKRELGSVDEILGPLTEILEG
VLQADQQLMEKTKAKVFSAFITVLQMKEMKVSDIPQYSQLVLNV 
CETLQEEVIALFDQTRHSLALGSATEDKDSMETDDCSRSRHRDQ 
RDGVCVLGLHLAKELCEVDEDGDSWLQVTRRLPILPTLLTTLEV
SLRMKQNLHFTEATLHLLLTLARTQQGATAVAGAGITQSICLPL
LSVYQLSTNGTAQTPSASRKSLDAPSWPGVYRLSMSLMEQLLKT 
LRYNFLPEALDFVGVHQERTLQCLNAVRTVQSLACLEEADHTVG
FILQLSNFMKEWHFHLPQLMRDIQVNLGYLCQACTSFLHSRKML 
QHYLQNKNGDGLPSAV\AQRV\QRPPSAASAAPSSSKQPAADTE 
ASEQQALHTVQYGLLKILSKTLAALRHFTPDVCQILLDQSLDLA 
EYNFLFALSFTTPTFDSEVAPSFGTLLATVNVALNMLGELDKKK 
EPLTQAVGLSTQAEGTRTLKSLLMFTMENCFYLLISQAMRYLRD
PAVHPRDKQRMKQELSSELSTLLSSLSRYFRRGAPSSPATGVLP 
SPQGKSTSLSKASPESQEPLIQLVQAFVRHMQR 


7129 


1 


1054 


FRRFRWRRRLH*AGPASSAGGSPGEASGTMSGELPPNINIKEPR 
WDQSTFIGRANHFFTVTDPRNILLTNEQLESARKIVHDYRQGIV 
PPGLTENELWRAKYIYDSAFHPDTGEKMILIGRMSAQVPMNMTI
TGCMMTFYRTTPAVLFWQWINQSFNAWNYTNRSGDAPLTVNEL 
GTAYVSATTGAVATALGLNALTKHVSPLIGRFVPFAAVAAANCI
NIPLMRQRELKVGIPVTDENGNRLGESANAAKQAITQVWSRIL
MAAPGMAIPPFIMNTLEKKAFLKRFPWMSAPIQVGLVGFCLVFA 
TPLCCALFPQKSSMSVTSLEAELQAKIQESHPELRRVYFNKGL 


7130 


2 


780 


HEVPSLQTSDPLPGSVQRCSVWSQPNKENWCQDHLYNSLGRKG
ISAKSQPYHRSQSSSSVLINKSMDSINYPSDVGKQQLLSLHRSS
RCESHQDLLPDIADSHQQGTEKLSDLTLQDSQKWWNRNLPLN
AQIATQNYFSNFKETDGDEDDYVEIKSEEDESELELSHNRRRKS
DSKFVDADFSDNVCSGNTLHSLNSPRTPKKPVNSKLGLSPYLTP
YNDSDKLNDYLWRGPSPNQQNIVQSLREKFQCLSSSSFA


7131 


805 


573 


AAAEGHIEWKFLIEACJCVNPFAKDRWGNIPLDDAVQFNHLEW
KLLQDYQDSYTLSETQAEAAAEALSKENLESMV


7132 


1420 


1087 


IDMLLLSGALVSGPYTLITTAVSADLGTHKSLKGNAHALSTVTA 
IIDGTGSVGAALGPLLAGLLSPSGWSNVFYMLMFADACALLFLI 
RLIHKELSCPGSATGDQVPFKEQ 


7133 


2 


3648 


QOIPGLLPAHGESGDALRKPRLQKPITGHLDDLFFTLYPSLEKF 
EEELLELHVQDHFQEGCGPLDGGALEILERRLRVGVHNGLGFVQ
RPQVWLVPEMDVALTRSASFSRKWSSSKTSSGSQALVLRSRL 
RLPEMVGHPAFAVIFQLEYVFSSPAGVDGNAASVTSLSNLACMH 
MVRWAVWNPLLEADSGRVTLPLQGGIQPNPSHCLVYKVPSASMS 
SEEVKQVESGTLRFQFSLGSEEHLDAPTEPVSGPKVERRPSRKP 
PTSPSSPPAPVPRVLAAPQNSPVGPGLSISQLAASPRSPTQHCL
ARPTSQLPHGSQASPAQAQEFPLEAGISHLEADLSQTSLVLETS
IAEQLQELPFTPLHAPIWGTQTRSSAGQPSRASMVLLQSSGFP 
EILDANKQPAEAVSATEPVTFNPQKEESDCLQSNEMVLQFLAFS
RVAQDCRGTSWPKTVYFTFQFYRFPPATTPRLQLVQLDEAGQPS 
SGALTHILVPVSRDGTFDAGSPGFQLRYMVGPGFLKPGERRCFA 
RYLAVQTLQIDVWDGDSLLLIGSAAVQMKHLLRQGRPAVQASHE
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SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








LEWATEYEQDNMWSGDMLGFGRVKPIGVHSWKGRLHLTLAN 
VGHPCEQKVRGCSTLPPSRSRV1SNDGASRFSGGSLLTTGSSRR 
KHVVQAQKLADVDSELAAMLLTHARQGKGPQDVSRESDATRRRK 
LERMRSVRLQEAGGDLGRRGTSVLAQQSVRTQHLRDLQVIAAYR 
ERTKAESIASLLSLAITTEHTLHATLGVAEFFEFVLKNPHNTQH
TVTVEIDNPELSVIVDSQEWRDFKGAAGLHTPVEEDMFHLRGSL
APQLYLRPHETAHVPFKFQSFSAGQLAMVQASPGLSNEKGMDAV 
SPWKSSAVPTKHAKVLFRASGGKPIAVLCLTVELQPHWDQVFR 
FYHPELSFLKKAIRLPPWHTFPGAPVGMLGEDPPVHVRCSDPNV 
ICETQNVGPGEPRDIFLKVASGPSPEIKDFFVIljYSDRWLATPT 
QTWQVYLHSLQRVDVSCVAGQLTRLSLVLRGTQTVRKVRAFTSH 
PQELKTDPKGVFVLPPRGVQDLHVGVRPLRAGSRFVHLNLVDVD 
CHQLVASWLVCLCCRQPLISKAFEIMLAAGEGKGVNKRITYTNP 
YPSRRTFHLHSDHPELLRFREDSFQVGGGETYTIGLQFAPSQRV 
GEEEILIYINDHEDKNEEAFCVKVIYQ


7134 


2115 


1111 


GGEGFSYPPHVGLSLGTPLDPHYVLLEVHYDNPTYEEGLIDNSG 
LRLFYTMDIRKYDAGVIEAGLWVSLFHTIPPGMPEFQSEGHCTL
ECLEEALEAEKPSGIHVFAVLLHAHLAGRGIRLRHFRKGKEMKL
IAYDDDFDFNFQEFQYLKEEQTILPGDNLITECRYNTKDRAEMT 
WGGLSTRSEMCLSYLLYYPRINLTRCASIPDIMEQLQFIGVKEI 
YRPVTTWPFIIKSPKQYKNLSFMDAMNKFKWTKKEGLSFNKLVL 
SLPVNVRCSKTDNAEWSIQGMTALPPDIERPYKAEPLVCGTSSS
SSLHRDFSINLLVCLLLLSCTLSTKSL 


7135 


2 


2072 


FVPRVTPRSLSLQGPKGESVGSITQPLPSSYLIFRAASESDGRC 
WLDALELALRCSSLLRLGTCKPGRDGEPGTSPDASPSSLCGLPA
SATVHPDQDLFPLNGSSLENDAFSDKSERENPEESDTETQDHSR 
KTESGSDQSETPGAPVRRGTTYVEQVQEELGELGEASQVETVSE 
ENKSLMWTLLKQLRPGMDLSRWLPTFVLEPRSFLNKLSDYYYH 
ADLLSRAAVEEDAYSRMKLVLRWYLSGFYKKPKGIKKPYNPILG 
ETFRCCWFHPQTDSRTFYIAEQVSHHPPVSAFHVSNRKDGFCIS 
GSITAKSRFYGNSLSALLDGKATLTFLNRAEDYTLTMPYAHCKG
ILYGTMTLELGGKVTIECAKNNFQAQLEFKLKPFFGGSTSINQI
SGKITSGEEVLASLSGHWDRDVFIKEEGSGSSALFWTPSGEVRR 
QRLRQHTVPLEEQTELESERLWQHVTRAISKGDQHRATQEKFAL
EEAQRQRARERQESLMPWKPQLFHLDPITQEWHYRYEDHSPWDP 
LKDIAQFEQDGILRTLQQEAVARQTTFLGSPGPRHERSGPDQRL
RKASDQPSGHSQATESSGSTPESCPELSDEEQDGDFVPGGESPC 
PRCRKEARRLQALHEAILSIREAQQELHRHLSAMLSSTARAAQA 
PTPGLLQSPRSWFLLCVFLACQLFINHILK


7136 


2 


418 


DFVPSFRRPSGNTSQTVWLLRAATLEKEVAGLREKIHHLDDMLK
SQQRKVRQMIEQLQNSKAVIQSKDATIQELKEKIAYLEAENLEM
HDRMEHLIEKQISHGNFSTQARAKTENPGSIRISKPPSPKPMPV 
IRWET 


7137 


2 


466 


WASGMSTVPGGSRHSLGIQVRGGWGVTGGEEESLTVPVADTWQA 
GSFKVATQERNPQRAQMRLRRQKKGWPFLGDFLTELQRLDSAI 
PDDLDGNTNKRSKEVRVLQEMQLLQVAAMNYRLRPLEKFVTYFT 
RMEQLSDKESYKLSCQLEPENP


7138


2 


466 


WASGMSTVPGGSRHSLGIQVRGGWGVTGGEEESLTVPVADTWQA 
GSFKVATQERNPQRAQMRLRRQKKGWPFLGDFLTELQRLDSAI 
PDDLDGNTNKRSKEVRVLQEMQLLQVAAMNYRLRPLEKFVTYFT
RMEQLSDKESYKLSCQLEPENP 


7139 




357 


SLRNSARGLKMAASAARGAAALRRSINQPVAFVRRIPWTAASSQ 
LKEHFAQFGHVRRCILPFDKETGFHRGLGWVQFSSEEGLRNALQ 
QENHIIDGVKVQVHTRRPKLPQTSDDEKKDF 


7140 


1401 


1357 


RASSLQVLKAWGGLIPSSFQQQHTGQYALEELFDLKVYDCFCSF
NMNVSLEKQLRPSQPWPRGKCRKTPGWEEARPKAQDLRGDLGKT
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SEQ 
ID 
NO: 


Predicted 
beginning 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Predicted end 
nucleotide 
location 
corresponding 
to first 
amino acid 
residue of 
amino acid 
sequence 


Amino acid segment containing signal peptide 
(A=Alanine, C=Cysteine, D=Aspartic Acid, E=
Glutamic Acid, F=Phenylalanine, G=Glycine,
H=Histidine, I=Isoleucine, K=Lysine,
L=Leucine, M=Methionine, N=Asparagine,
P=Proline, Q=Glutamine, R=Arginine,
S=Serine, T=Threonine, V=Valine,
W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop
Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion)








QAGPAEAHTRGPPRLPAATGCPPHLPGLLSGISVDIDPTGLQSQ 
WTPKGQDPPLMFSEDYQKSLLEQYHLGLDQKLRKYWGELIWNF 
ADFMTNQCG 


7141 


124 


1073 


LDSRSCWLDMEDLEEDVRFIVDETLDFGGLSPSDSREEEDITVL 
VTPEKPLRRGLSHRSDPNAVAPAPQGVRLSLGPLSPEKLEEILD
EANRLAAQLEQCALQDRESAGEGLGPRRVKPSPRRETFVLKDSP
VRDLLPTVNSLTRSTPS/LKQPDASTPE***EGVSQGSPGYIWK
EALQHEEGVTHLQSVPCIQKPSIFSS\SRSTPPVRGRAGPSGRA 
AASEETRAAKLRGAAAKSSCQLPIPSAIPRPASRMPLTSRSVPP
GRGALPPDSLSTRKGLPRPSTAGHRVRESGHKVPVSQRLNLPVM 
GATRSNLQPP 


7142 


658 


839 


LIFLMLHMELKMLSSVTLHIRAFLYWICLKPTSCLIFQNVLNLL 
KK*SRAVGWWMCRT/YSSDLQVGVIKPWLLLGSQDAAHDLDT
LKKNKVTHILNVAYGVENAFLSDFTYKSISILDLPETNILSYFP
ECFEFIEEAKRKDGWLVHCNA 


7143 


3 


773 


SLEMSSDGEPLSRMDSEDSISSTIMDVDSTISSGRSTPAMMNGQ
GSTTSSSKNIAYNCCWDQCQACFNSSPDLADHIRSIHVDGQRGG
VFVCLWKGCKVYNTPSTSQSWLQRHMLTHSGDKPFKCWGGCNA
SFASQGGLARHVPTHFSQQNSSKVSSQPKAKEESPSKAGMNKRR 
KLKNKRRRSLARPHDFFDAQTLDAIRHRA1CFNLSAHIESLGKG 
ti£>VVbttbl VSIl»LFFQIKYKTIiQKNISTIISKSLKI 


7144 


1 


988 


FRVNMQDGGPSPAEHSKAEESAGMEARFLGLPDAAGSSGPTPAR 
RCPAPRPAGVSYVIRDEVEKYNRNGVNALQLDPALNRLFTAGRD 
SIIRIWSVNQHKQDPYIASMEHHTDWVNDIVLCCNGKTLISASS 
DTTVKVWNAHKGFCMSTLRTHKDYVKALAYAKDKELVASAGLDR 
QIFLWDVNTLTALTASNNTVTTSSLSGNKDSIYSLAMNQLGTII
VSGSTEKVLRVWDPRTCAKLMKLKGHTDNVKALLLNRDGTQCLS 
GSSDGTIRLWSLGQQRCIATYRVHDEGVWALQVNDAFTHVYSGG 
RDRKIYCTDLRNPDIRVLICE 
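
The one-letter residue codes and the marker symbols defined in the table legend (* for a stop codon, / for a possible nucleotide deletion, \ for a possible nucleotide insertion) lend themselves to a simple programmatic reading of each table entry. The following is a minimal sketch only, not part of the disclosure: the function name `summarize_segment` and the returned field names are hypothetical, and it assumes that whitespace inside an entry is layout only. Characters not covered by the legend are reported rather than interpreted.

```python
# Symbol meanings are taken from the table legend; all names here are
# illustrative assumptions, not part of the application.
AMINO_ACIDS = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid",
    "E": "Glutamic Acid", "F": "Phenylalanine", "G": "Glycine",
    "H": "Histidine", "I": "Isoleucine", "K": "Lysine",
    "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine",
    "S": "Serine", "T": "Threonine", "V": "Valine",
    "W": "Tryptophan", "Y": "Tyrosine", "X": "Unknown",
}
MARKERS = {"*": "stop", "/": "deletion", "\\": "insertion"}


def summarize_segment(text):
    """Separate residues from legend markers in one table entry.

    Whitespace is skipped (assumed to be layout only). Any character
    that the legend does not define is collected under "unrecognized"
    instead of being guessed at.
    """
    residues = []
    counts = {"stop": 0, "deletion": 0, "insertion": 0}
    unrecognized = []
    for ch in text:
        if ch.isspace():
            continue
        if ch in AMINO_ACIDS:
            residues.append(ch)
        elif ch in MARKERS:
            counts[MARKERS[ch]] += 1
        else:
            unrecognized.append(ch)
    return {
        "residues": "".join(residues),
        "length": len(residues),
        "stop": counts["stop"],
        "deletion": counts["deletion"],
        "insertion": counts["insertion"],
        "unrecognized": unrecognized,
    }


# Example on a short fragment of the entry for SEQ ID NO: 7141.
summary = summarize_segment("TPS/LKQPDASTPE***EGVSQGS")
print(summary["length"], summary["stop"], summary["deletion"])
```

Keeping unrecognized characters separate is a deliberate choice in this sketch: a digit or other stray symbol in an entry cannot be mapped to a residue from the legend alone.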



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WHAT IS CLAIMED IS: 

1. An isolated polynucleotide comprising a nucleotide sequence selected from the
group consisting of SEQ ID NO:1-1786 and 3573-5358, a mature protein coding portion
of SEQ ID NO:1-1786 and 3573-5358, an active domain of SEQ ID NO:1-1786 and
3573-5358, and complementary sequences thereof.

2. An isolated polynucleotide encoding a polypeptide with biological activity, 
wherein said polynucleotide hybridizes to the polynucleotide of claim 1 under stringent 
hybridization conditions. 

3. An isolated polynucleotide encoding a polypeptide with biological activity, 
wherein said polynucleotide has greater than about 90% sequence identity with the 
polynucleotide of claim 1.

4. The polynucleotide of claim 1 wherein said polynucleotide is DNA. 

5. An isolated polynucleotide of claim 1 wherein said polynucleotide comprises the 
complementary sequences. 

6. A vector comprising the polynucleotide of claim 1.

7. An expression vector comprising the polynucleotide of claim 1.

8. A host cell genetically engineered to comprise the polynucleotide of claim 1.

9. A host cell genetically engineered to comprise the polynucleotide of claim 1 
operatively associated with a regulatory sequence that modulates expression of the 
polynucleotide in the host cell. 

10. An isolated polypeptide, wherein the polypeptide is selected from the group
consisting of: 
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(a) a polypeptide encoded by any one of the polynucleotides of claim 1 ; and 

(b) a polypeptide encoded by a polynucleotide hybridizing under stringent 
conditions with any one of SEQ ID NO:1-1786 and 3573-5358.

11. A composition comprising the polypeptide of claim 10 and a carrier.

12. An antibody directed against the polypeptide of claim 10. 

13. A method for detecting the polynucleotide of claim 1 in a sample, comprising:

a) contacting the sample with a compound that binds to and forms a 
complex with the polynucleotide of claim 1 for a period sufficient to form the complex; 
and 

b) detecting the complex, so that if a complex is detected, the
polynucleotide of claim 1 is detected. 

14. A method for detecting the polynucleotide of claim 1 in a sample, comprising: 

a) contacting the sample under stringent hybridization conditions 
with nucleic acid primers that anneal to the polynucleotide of claim 1 under such 
conditions; 

b) amplifying a product comprising at least a portion of the 
polynucleotide of claim 1; and 

c) detecting said product and thereby the polynucleotide of claim 1 in
the sample.

15. The method of claim 14, wherein the polynucleotide is an RNA molecule and the 
method further comprises reverse transcribing an annealed RNA molecule into a cDNA 
polynucleotide. 

16. A method for detecting the polypeptide of claim 10 in a sample, comprising:






a) contacting the sample with a compound that binds to and forms a 
complex with the polypeptide under conditions and for a period sufficient to form the 
complex; and 

b) detecting formation of the complex, so that if a complex formation 
is detected, the polypeptide of claim 10 is detected. 

17. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10 under 
conditions sufficient to form a polypeptide/compound complex; and 

b) detecting the complex, so that if the polypeptide/compound 
complex is detected, a compound that binds to the polypeptide of claim 10 is identified. 

18. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10, in a 
cell, under conditions sufficient to form a polypeptide/compound complex, wherein the 
complex drives expression of a reporter gene sequence in the cell; and 

b) detecting the complex by detecting reporter gene sequence 
expression, so that if the polypeptide/compound complex is detected, a compound that 
binds to the polypeptide of claim 10 is identified. 

19. A method of producing the polypeptide of claim 10, comprising:

a) culturing a host cell comprising a polynucleotide sequence selected
from the group consisting of a polynucleotide sequence of SEQ ID NO:1-1786 and
3573-5358, a mature protein coding portion of SEQ ID NO:1-1786 and 3573-5358, an active
domain of SEQ ID NO:1-1786 and 3573-5358, complementary sequences thereof and a
polynucleotide sequence hybridizing under stringent conditions to SEQ ID NO:1-1786
and 3573-5358, under conditions sufficient to express the polypeptide in said cell; and

b) isolating the polypeptide from the cell culture or cells of step (a). 
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20. An isolated polypeptide comprising an amino acid sequence selected from the 
group consisting of any one of the polypeptides SEQ ID NO:1787-3572 and 5359-7144,
the mature protein portion thereof, or the active domain thereof. 

21. The polypeptide of claim 20 wherein the polypeptide is provided on a polypeptide
array. 



22. A collection of polynucleotides, wherein the collection comprising the sequence 
information of at least one of SEQ ID NO:1-1786 and 3573-5358.

23. The collection of claim 22, wherein the collection is provided on a nucleic acid 
array. 

24. The collection of claim 23, wherein the array detects full-matches to any one of 
the polynucleotides in the collection. 

25. The collection of claim 23, wherein the array detects mismatches to any one of 
the polynucleotides in the collection. 

26. The collection of claim 22, wherein the collection is provided in a computer- 
readable format. 



27. A method of treatment comprising administering to a mammalian subject in need 
thereof a therapeutic amount of a composition comprising a polypeptide of claim 10 or 20 
and a pharmaceutically acceptable carrier. 

28. A method of treatment comprising administering to a mammalian subject in need 
thereof a therapeutic amount of a composition comprising an antibody that specifically 
binds to a polypeptide of claim 10 or 20 and a pharmaceutically acceptable carrier. 
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Group II, claim 12 and 28, drawn to antibodies and method of treatment using composition comprising said antibodies. 
Group III, claims 17-18, drawn to methods of identifying a binding partner to a polypeptide.
Group IV, claim 27, drawn to method of treatment using composition comprising polypeptides. 

The inventions listed as Groups I-IV do not relate to a single inventive concept under PCT Rule 13.1 because, under PCT Rule 13.2,
they lack the same or corresponding special technical features for the following reasons: Group I encompasses nucleic acids,
polypeptides expressed thereby, vectors and host cells containing same, respectively, and methods of making as well as the first method
of use of this subject matter. Groups II-IV all are directed to different special technical features as summarized as follows: Group II is
directed to an antibody and method of treatment using same, which antibody undergoes recognition and binding reactions wherein
what is bound is different from what is bound by the compositions of Group I. For example, the polypeptides of Group I do not bind
the polypeptides of Group I as the antibody of Group II does. Identification of binding partner and treatment are clearly different
special technical features from detection. Group III is directed to the identification of a binding partner of a polypeptide, which is not
identified in any of the other Groups and thus clearly contains its own special technical feature. Group IV is directed to treatment,
which is a clearly different method from the methods in the other Groups. Thus, in summary, each of Groups I-IV is directed to
different special technical features, which supports this finding of lack of unity.

Additionally, each of the claims is directed to more than one species of the generic invention. These species are deemed to lack unity
of invention because they are not so linked as to form a single inventive concept under PCT Rule 13.1. In order for more than one
species to be searched, the appropriate additional search fees must be paid. The species are as follows: The claims include a series of
polynucleotides and the polypeptides encoded thereby as represented by the sequences of SEQ ID NOs: 1-1786 and 3573-5358. Each
of these polynucleotide sequences encodes a separate polypeptide and thus represents a separate gene. Therefore, each of these genes
defines its own special technical feature. In summary, one species is a gene represented by one polynucleotide sequence and one
polypeptide sequence encoded thereby.
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